Bank Churn Prediction

Problem Statement

Context

Businesses like banks which provide service have to worry about problem of 'Customer Churn' i.e. customers leaving and joining another service provider. It is important to understand which aspects of the service influence a customer's decision in this regard. Management can concentrate efforts on improvement of service, keeping in mind these priorities.

Objective

You as a Data scientist with the bank need to build a neural network based classifier that can determine whether a customer will leave the bank or not in the next 6 months.

Data Dictionary

  • CustomerId: Unique ID which is assigned to each customer

  • Surname: Last name of the customer

  • CreditScore: It defines the credit history of the customer.

  • Geography: A customer’s location

  • Gender: It defines the Gender of the customer

  • Age: Age of the customer

  • Tenure: Number of years for which the customer has been with the bank

  • NumOfProducts: refers to the number of products that a customer has purchased through the bank.

  • Balance: Account balance

  • HasCrCard: It is a categorical variable which decides whether the customer has credit card or not.

  • EstimatedSalary: Estimated salary

  • isActiveMember: Is is a categorical variable which decides whether the customer is active member of the bank or not ( Active member in the sense, using bank products regularly, making transactions etc )

  • Exited : whether or not the customer left the bank within six month. It can take two values 0=No ( Customer did not leave the bank ) 1=Yes ( Customer left the bank )

Importing necessary libraries

In [1]:
# Installing the libraries with the specified version.
# !pip install tensorflow==2.15.0 scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==2.0.3 imbalanced-learn==0.10.1 -q --user
In [2]:
# Libraries to help with reading and manipulating data
import pandas as pd
import numpy as np

import time

# libaries to help with data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Library to split data
from sklearn.model_selection import train_test_split

# library to import to standardize the data
from sklearn.preprocessing import StandardScaler, LabelEncoder

# importing different functions to build models
import tensorflow as tf
from tensorflow import keras
from keras import backend
from keras.models import Sequential
from keras.layers import Dense, Dropout

from tensorflow.keras.layers import BatchNormalization

# importing SMOTE
from imblearn.over_sampling import SMOTE

# importing metrics
from sklearn.metrics import confusion_matrix,roc_curve,classification_report,recall_score

import random

# Library to avoid the warnings
import warnings
warnings.filterwarnings("ignore")

Loading the dataset

In [3]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [4]:
churn = pd.read_csv('/content/drive/MyDrive/Python Course/Churn.csv')

# Make a copy of the data
ds = churn.copy()

Data Overview

View the first and last 5 rows of the dataset.

In [5]:
# let's view the first 5 rows of the data
ds.head()
Out[5]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1 1 101348.88 1
1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0 1 112542.58 0
2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1 0 113931.57 1
3 4 15701354 Boni 699 France Female 39 1 0.00 2 0 0 93826.63 0
4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1 1 79084.10 0
In [6]:
# let's view the last 5 rows of the data
ds.tail()
Out[6]:
RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
9995 9996 15606229 Obijiaku 771 France Male 39 5 0.00 2 1 0 96270.64 0
9996 9997 15569892 Johnstone 516 France Male 35 10 57369.61 1 1 1 101699.77 0
9997 9998 15584532 Liu 709 France Female 36 7 0.00 1 0 1 42085.58 1
9998 9999 15682355 Sabbatini 772 Germany Male 42 3 75075.31 2 1 0 92888.52 1
9999 10000 15628319 Walker 792 France Female 28 4 130142.79 1 1 0 38190.78 0

Understand the shape of the dataset

In [7]:
# Checking the number of rows and columns in the training data
ds.shape
Out[7]:
(10000, 14)
  • There are 10000 rows and 14 columns in the dataset.

Check the data types of the columns for the dataset

In [8]:
ds.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   RowNumber        10000 non-null  int64  
 1   CustomerId       10000 non-null  int64  
 2   Surname          10000 non-null  object 
 3   CreditScore      10000 non-null  int64  
 4   Geography        10000 non-null  object 
 5   Gender           10000 non-null  object 
 6   Age              10000 non-null  int64  
 7   Tenure           10000 non-null  int64  
 8   Balance          10000 non-null  float64
 9   NumOfProducts    10000 non-null  int64  
 10  HasCrCard        10000 non-null  int64  
 11  IsActiveMember   10000 non-null  int64  
 12  EstimatedSalary  10000 non-null  float64
 13  Exited           10000 non-null  int64  
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB
  • There are 11 numeric columns and 3 object columns.

Checking the Statistical Summary

In [9]:
ds.describe()
Out[9]:
RowNumber CustomerId CreditScore Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited
count 10000.00000 1.000000e+04 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.00000 10000.000000 10000.000000 10000.000000
mean 5000.50000 1.569094e+07 650.528800 38.921800 5.012800 76485.889288 1.530200 0.70550 0.515100 100090.239881 0.203700
std 2886.89568 7.193619e+04 96.653299 10.487806 2.892174 62397.405202 0.581654 0.45584 0.499797 57510.492818 0.402769
min 1.00000 1.556570e+07 350.000000 18.000000 0.000000 0.000000 1.000000 0.00000 0.000000 11.580000 0.000000
25% 2500.75000 1.562853e+07 584.000000 32.000000 3.000000 0.000000 1.000000 0.00000 0.000000 51002.110000 0.000000
50% 5000.50000 1.569074e+07 652.000000 37.000000 5.000000 97198.540000 1.000000 1.00000 1.000000 100193.915000 0.000000
75% 7500.25000 1.575323e+07 718.000000 44.000000 7.000000 127644.240000 2.000000 1.00000 1.000000 149388.247500 0.000000
max 10000.00000 1.581569e+07 850.000000 92.000000 10.000000 250898.090000 4.000000 1.00000 1.000000 199992.480000 1.000000
  • The average credit score of customers is ~651 and the median is similar at ~652. 25% of customers have credit scores above 718.
  • The average age of customers is ~39 years. The median age is 37 years. 25% of the customers have ages above 44 years. The max age is 92 years.
  • The average custmoer tenure is ~5 years, which is similar to the median of 5.
  • The average customer balance is ~76500. 25% of customers have a balance of 0.
  • The average number of products of customers is ~1.5. The median is 1 and 25% of customers have ore than 2 products.
  • Half of the customers have credit cards and are active members.
  • The average and median estimated customer salary are ~100K. The min salary is about 12 dollars, which is probably a mistake. The max is ~200K dollars.

Checking for Missing Values

In [10]:
# let's check for missing values in the data
ds.isna().sum()
Out[10]:
0
RowNumber 0
CustomerId 0
Surname 0
CreditScore 0
Geography 0
Gender 0
Age 0
Tenure 0
Balance 0
NumOfProducts 0
HasCrCard 0
IsActiveMember 0
EstimatedSalary 0
Exited 0

  • There are no null values.

Checking for Duplicate Values

In [11]:
ds.duplicated().sum()
Out[11]:
0
  • There are no duplicate values.

Checking for unique values for each of the column

In [12]:
ds.nunique()
Out[12]:
0
RowNumber 10000
CustomerId 10000
Surname 2932
CreditScore 460
Geography 3
Gender 2
Age 70
Tenure 11
Balance 6382
NumOfProducts 4
HasCrCard 2
IsActiveMember 2
EstimatedSalary 9999
Exited 2

  • RowNumber and CustomerId are unique, hence should be dropped.
  • Surname will not be useful also, so it should be dropped.
In [13]:
# Drop RowNumber, CustomerId and Surname
ds = ds.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)
  • Preview some more of the data
In [14]:
for i in ds.describe(include=["object"]).columns:
    print("Unique values in", i, "are :")
    print(ds[i].value_counts())
    print("*" * 50)
Unique values in Geography are :
Geography
France     5014
Germany    2509
Spain      2477
Name: count, dtype: int64
**************************************************
Unique values in Gender are :
Gender
Male      5457
Female    4543
Name: count, dtype: int64
**************************************************

Exploratory Data Analysis

Univariate Analysis

In [15]:
# function to plot a boxplot and a histogram along the same scale.


def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a star will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # For histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
In [16]:
# function to create labeled barplots


def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with percentage at the top

    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """

    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))

    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )

    for p in ax.patches:
        if perc == True:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category

        x = p.get_x() + p.get_width() / 2  # width of the plot
        y = p.get_height()  # height of the plot

        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the percentage

    plt.show()  # show the plot

Observations on CreditScore

In [17]:
histogram_boxplot(ds,'CreditScore', kde=True)
  • The distribution of Credit Score for customers is mostly normal, with just a litte bit of left skewness.
  • The mean and median are similar at about 651.
  • There are outliers on the lower side of the Credit Score data.

Observations on Age

In [18]:
histogram_boxplot(ds, 'Age', kde=True)
  • The distribution of Age of customers is right-skewed.
  • The median and mean at 37 and about 39 years, respectively.
  • The are outliers on the higher side of the data.

Observations on Balance

In [19]:
histogram_boxplot(ds, 'Balance', kde=True)
  • 25% of the customers have a minimum balance of 0. Due to that fact, the data distribution initially, is to the left but for the rest of the data, the distribution of Balance is normal.
  • The median and mean at ~97K and about 76K, respectively.

Observations on Estimated Salary

In [20]:
histogram_boxplot(ds, 'EstimatedSalary', kde=True)
  • The distribution of Estimated Salary is uniform.
  • The median and mean are similar at about 100K.

Observations on Exited

In [21]:
labeled_barplot(ds, "Exited", perc=True)
  • ~80% of the customers did not exit, while ~20% exited.

Observations on Geography

In [22]:
labeled_barplot(ds, 'Geography', perc=True)
  • Half of the customers are from France, while a quarter are from Germany and the remaining quarter from Spain.

Observations on Gender

In [23]:
labeled_barplot(ds, 'Gender', perc=True)
  • ~45% of the customers are female, while ~55% are male.

Observations on Tenure

In [24]:
labeled_barplot(ds, 'Tenure', perc=True)
  • Customers that have been with the company from 1 to 9 years make up majority of the customers at a combined 91%, with each of the individual year categories at between about 10-11%.
  • Customers that have been with the company from 0 years are the least a ~4% and customers that hava a tenure of 10 years are at ~5%.

Observations on Number of Products

In [25]:
labeled_barplot(ds, 'NumOfProducts', perc=True)
  • Customers that had only one product with the compan were the most at ~51%, followed by customers with two products at ~46%, customers with 3 products at ~3% and customers at below 1%.

Observations on Has Credit Card

In [26]:
labeled_barplot(ds, 'HasCrCard', perc=True)
  • ~29% of customers have no credit cards while ~71% have credit cards.

Observations on Is Active Member

In [27]:
labeled_barplot(ds, 'IsActiveMember', perc=True)
  • 48.5% of the customers are not active members, while 51.5% are active members.

Bivariate Analysis

In [28]:
# function to plot stacked bar chart


def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart

    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 1, 5))
    plt.legend(
        loc="lower left",
        frameon=False,
    )
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
In [29]:
### Function to plot distributions

def distribution_plot_wrt_target(data, predictor, target):

    fig, axs = plt.subplots(2, 2, figsize=(12, 10))

    target_uniq = data[target].unique()
    target_uniq.sort()

    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=data[data[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="teal",
    )

    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=data[data[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="orange",
    )

    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")

    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=data,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="gist_rainbow",
    )

    plt.tight_layout()
    plt.show()

Correlation plot

In [30]:
# defining the list of numerical columns
cols_list = ["CreditScore","Age","Tenure","Balance","EstimatedSalary", "NumOfProducts"]
In [ ]:
# Pairplot with hue=Exited
sns.pairplot(ds, vars=cols_list, hue='Exited', diag_kind='kde')
#plt.show()
Out[ ]:
<seaborn.axisgrid.PairGrid at 0x7f34b71d69b0>
  • Customers with ages between about 40 to about 70, exited more.
  • Customers with 3-4 number of products, exited more.
In [ ]:
plt.figure(figsize=(15, 7))
sns.heatmap(ds[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
plt.show()
  • There is a negative correlationship between Balance and NumOfProducts.

Exited Vs Geography

In [ ]:
stacked_barplot(ds, "Geography", "Exited" )
Exited        0     1    All
Geography                   
All        7963  2037  10000
Germany    1695   814   2509
France     4204   810   5014
Spain      2064   413   2477
------------------------------------------------------------------------------------------------------------------------
  • The percentage of customers that exited was the most in Germany at ~32%. ~17% exited from Spain and ~16% from France.

Exited Vs Gender

In [ ]:
stacked_barplot(ds, "Gender", "Exited" )
Exited     0     1    All
Gender                   
All     7963  2037  10000
Female  3404  1139   4543
Male    4559   898   5457
------------------------------------------------------------------------------------------------------------------------
  • The percentage of female customers that exited is ~25% and ~16% for male customers. So, a higher percentage of females exited than males.

Exited Vs Has Credit Card

In [ ]:
stacked_barplot(ds, "HasCrCard", "Exited" )
  • The percentage of customers that exited who had or did not have credit cards are similar at ~20 to 21%. So, this is not a factor in terms of determining exiting customers.

Exited Vs Is active member

In [ ]:
stacked_barplot(ds, "IsActiveMember", "Exited" )
  • A higher percentage of non active members, ~27%, exited, than active members, ~14%.

Exited Vs Credit Score

In [ ]:
distribution_plot_wrt_target(ds, "CreditScore", "Exited")
  • The distribution of Credit Score for customers who exited and those that did not is mostly normal, with just a slight of left skewness.
  • The median credit score for the customers that did not exit is ~653, while for those that exited is ~646.
  • There are some outliers on the lower credit score side for customers that exited.

Exited Vs Age

In [ ]:
distribution_plot_wrt_target(ds, "Age", "Exited")
  • The Age distribution for customers that did not exit is left skewed but is normal for those hat exited.
  • Those that exited have a higher median age at ~45 than those that did not exit at ~37.
  • Both groups have outliers on the higher side of the age data.

Exited Vs Tenure

In [ ]:
plt.figure(figsize=(5,5))
sns.boxplot(y='Tenure',x='Exited',data=ds)
plt.show()
  • The mean and median tenure of both customers that exited and those that did not are similar at 5.

Exited Vs Balance

In [ ]:
distribution_plot_wrt_target(ds, "Balance", "Exited")
  • 25% of the customers that did not exit, have a minimum balance of 0. Due to that fact, the data distribution is initially to the left but for the rest of the data, the distribution of Balance is normal. The distribution is similar for the customers that exited.
  • The median of the customers that did not exit is ~92K and ~109K for those that exited. So, customers with lower balances exited less than those with higher balances.

Exited Vs Number of Products

In [ ]:
plt.figure(figsize=(5,5))
sns.boxplot(y='NumOfProducts',x='Exited',data=ds)
plt.show()
  • The median of customers that did not exit is 2, while for those that exited is 1 product.
  • 25% of customers for both groups have 1 product, while 75% of the customers for both groups have 2 products.
  • There are outliers in the data for customers who exited.

Exited Vs Estimated Salary

In [ ]:
distribution_plot_wrt_target(ds, "EstimatedSalary", "Exited")
  • The distribution of Estimated Salary for both the customers that exited and those that did not exit, are similarly uniform.
  • The median Estimated Salary for customers that did not exit is ~100K and ~102K for those that exited.

Data Preprocessing

Dummy Variable Creation

In [49]:
ds = pd.get_dummies(ds,columns=ds.select_dtypes(include=["object"]).columns.tolist(),drop_first=True)
ds = ds.astype(float)
ds.head()
Out[49]:
CreditScore Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited Geography_Germany Geography_Spain Gender_Male
0 619.0 42.0 2.0 0.00 1.0 1.0 1.0 101348.88 1.0 0.0 0.0 0.0
1 608.0 41.0 1.0 83807.86 1.0 0.0 1.0 112542.58 0.0 0.0 1.0 0.0
2 502.0 42.0 8.0 159660.80 3.0 1.0 0.0 113931.57 1.0 0.0 0.0 0.0
3 699.0 39.0 1.0 0.00 2.0 0.0 0.0 93826.63 0.0 0.0 0.0 0.0
4 850.0 43.0 2.0 125510.82 1.0 1.0 1.0 79084.10 0.0 0.0 1.0 0.0

Train-validation-test Split

In [50]:
# set up the independent and dependent variables
X = ds.drop(['Exited'],axis=1)
y = ds['Exited'] # Exited
In [51]:
# Splitting the dataset into the Training and Testing set.

X_large, X_test, y_large, y_test = train_test_split(X, y, test_size = 0.15, random_state = 42,stratify=y,shuffle = True)
In [52]:
# Splitting the dataset into the Training and Validation set.

X_train, X_val, y_train, y_val = train_test_split(X_large, y_large, test_size = 0.17647, random_state = 42,stratify=y_large, shuffle = True)
  • Split used: 70/15/15
In [53]:
print(X_train.shape, X_val.shape, X_test.shape)
(7000, 11) (1500, 11) (1500, 11)
In [54]:
print(y_train.shape, y_val.shape, y_test.shape)
(7000,) (1500,) (1500,)

Data Normalization

Since all the numerical values are on a different scales, bring all of them to the same scale by scaling.

In [55]:
# creating an instance of the standard scaler
sc = StandardScaler()

X_train[cols_list] = sc.fit_transform(X_train[cols_list])
X_val[cols_list] = sc.transform(X_val[cols_list])
X_test[cols_list] = sc.transform(X_test[cols_list])

Model Building

Model Evaluation Criterion

  • The model can make wrong predictions by either predicting that a customer will exit and the customer doesn't exit or predicting that a customer will not exit and the customer exits.
  • Of these two, the more important case is the latter, which is the model predicting that a customer will not exit but the customer exits, as this will amount to losing a valuable customer, potential business and revenue.
  • These are the False Negatives. So, Recall, would be the metric to look at so that the bank can retain their valuable customers by identifying the customers who are at risk of exiting.
  • The bank would want Recall to be maximized, as the greater the Recall, the higher the chances of minimizing false negatives. Hence, the focus should be on increasing Recall or minimizing the false negatives or in other words identifying the true positives(i.e. Exited = 1)

Functions & Set Ups

Let's create a function for plotting the confusion matrix

In [2113]:
def make_confusion_matrix(actual_targets, predicted_targets):
    """
    To plot the confusion_matrix with percentages

    actual_targets: actual target (dependent) variable values
    predicted_targets: predicted target (dependent) variable values
    """
    cm = confusion_matrix(actual_targets, predicted_targets)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(cm.shape[0], cm.shape[1])

    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")

Create the plot function for repetitive use

In [2114]:
def plot(history, name):
    """
    Function to plot loss/accuracy

    history: an object which stores the metrics and losses.
    name: can be one of Loss or Accuracy
    """
    fig, ax = plt.subplots() #Creating a subplot with figure and axes.
    plt.plot(history.history[name]) #Plotting the train accuracy or train loss
    plt.plot(history.history['val_'+name]) #Plotting the validation accuracy or validation loss

    plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
    plt.ylabel(name.capitalize()) #Capitalizing the first letter.
    plt.xlabel('Epoch') #Defining the label for the x-axis.
    fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.

Let's create two blank dataframes that will store the recall values for all the models we build.

In [2115]:
train_metric_df = pd.DataFrame(columns=["recall"])
valid_metric_df = pd.DataFrame(columns=["recall"])

Create a more comprehensive results dataframe

In [2116]:
columns = ["# hidden layers","# neurons - hidden layer","activation function - hidden layer ","# epochs","batch size","optimizer","time(secs)","Train_loss","Valid_loss","Train_Recall","Valid_Recall"]

results = pd.DataFrame(columns=columns)

Neural Network with SGD Optimizer (NO HIDDEN LAYER)

  • Let's start with this as a baseline model - no hidden layers
In [2117]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2118]:
# Initializing the neural network
model_nhl = Sequential()

# Adding the input layer with neurons
model_nhl.add(Dense(1, activation = 'sigmoid', input_dim = X_train.shape[1]))
In [2119]:
# Use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2120]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_nhl.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2121]:
model_nhl.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 1)                   │              12 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 12 (48.00 B)
 Trainable params: 12 (48.00 B)
 Non-trainable params: 0 (0.00 B)
In [2122]:
epochs = 50
batch_size = X_train.shape[0]
In [2123]:
# Get start time
start = time.time()

# Fitting the ANN

history_nhl = model_nhl.fit(
    X_train, y_train,
    batch_size=batch_size,
    validation_data=(X_val,y_val),
    epochs=epochs,
    verbose=1
)

# Get end time
end=time.time()
Epoch 1/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 518ms/step - loss: 1.1625 - recall: 0.4972 - val_loss: 1.1488 - val_recall: 0.5246
Epoch 2/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step - loss: 1.1620 - recall: 0.4972 - val_loss: 1.1483 - val_recall: 0.5246
Epoch 3/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 161ms/step - loss: 1.1615 - recall: 0.4972 - val_loss: 1.1477 - val_recall: 0.5246
Epoch 4/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 113ms/step - loss: 1.1610 - recall: 0.4972 - val_loss: 1.1472 - val_recall: 0.5246
Epoch 5/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 1.1604 - recall: 0.4972 - val_loss: 1.1467 - val_recall: 0.5246
Epoch 6/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 152ms/step - loss: 1.1599 - recall: 0.4965 - val_loss: 1.1462 - val_recall: 0.5246
Epoch 7/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - loss: 1.1594 - recall: 0.4965 - val_loss: 1.1456 - val_recall: 0.5246
Epoch 8/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 147ms/step - loss: 1.1588 - recall: 0.4965 - val_loss: 1.1451 - val_recall: 0.5246
Epoch 9/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 129ms/step - loss: 1.1583 - recall: 0.4958 - val_loss: 1.1446 - val_recall: 0.5246
Epoch 10/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 155ms/step - loss: 1.1578 - recall: 0.4958 - val_loss: 1.1441 - val_recall: 0.5246
Epoch 11/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - loss: 1.1573 - recall: 0.4958 - val_loss: 1.1436 - val_recall: 0.5246
Epoch 12/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - loss: 1.1567 - recall: 0.4951 - val_loss: 1.1430 - val_recall: 0.5246
Epoch 13/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 141ms/step - loss: 1.1562 - recall: 0.4951 - val_loss: 1.1425 - val_recall: 0.5246
Epoch 14/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 153ms/step - loss: 1.1557 - recall: 0.4930 - val_loss: 1.1420 - val_recall: 0.5213
Epoch 15/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step - loss: 1.1552 - recall: 0.4930 - val_loss: 1.1415 - val_recall: 0.5213
Epoch 16/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - loss: 1.1546 - recall: 0.4930 - val_loss: 1.1410 - val_recall: 0.5213
Epoch 17/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 146ms/step - loss: 1.1541 - recall: 0.4930 - val_loss: 1.1405 - val_recall: 0.5213
Epoch 18/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - loss: 1.1536 - recall: 0.4930 - val_loss: 1.1399 - val_recall: 0.5213
Epoch 19/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 133ms/step - loss: 1.1531 - recall: 0.4930 - val_loss: 1.1394 - val_recall: 0.5213
Epoch 20/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 111ms/step - loss: 1.1526 - recall: 0.4923 - val_loss: 1.1389 - val_recall: 0.5213
Epoch 21/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step - loss: 1.1520 - recall: 0.4923 - val_loss: 1.1384 - val_recall: 0.5213
Epoch 22/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 112ms/step - loss: 1.1515 - recall: 0.4923 - val_loss: 1.1379 - val_recall: 0.5213
Epoch 23/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 85ms/step - loss: 1.1510 - recall: 0.4923 - val_loss: 1.1374 - val_recall: 0.5213
Epoch 24/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - loss: 1.1505 - recall: 0.4916 - val_loss: 1.1368 - val_recall: 0.5213
Epoch 25/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 134ms/step - loss: 1.1500 - recall: 0.4916 - val_loss: 1.1363 - val_recall: 0.5213
Epoch 26/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 94ms/step - loss: 1.1494 - recall: 0.4916 - val_loss: 1.1358 - val_recall: 0.5213
Epoch 27/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - loss: 1.1489 - recall: 0.4916 - val_loss: 1.1353 - val_recall: 0.5213
Epoch 28/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 148ms/step - loss: 1.1484 - recall: 0.4902 - val_loss: 1.1348 - val_recall: 0.5213
Epoch 29/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - loss: 1.1479 - recall: 0.4902 - val_loss: 1.1343 - val_recall: 0.5213
Epoch 30/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 141ms/step - loss: 1.1474 - recall: 0.4902 - val_loss: 1.1338 - val_recall: 0.5213
Epoch 31/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 96ms/step - loss: 1.1468 - recall: 0.4902 - val_loss: 1.1333 - val_recall: 0.5213
Epoch 32/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 1.1463 - recall: 0.4902 - val_loss: 1.1327 - val_recall: 0.5213
Epoch 33/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - loss: 1.1458 - recall: 0.4895 - val_loss: 1.1322 - val_recall: 0.5213
Epoch 34/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 144ms/step - loss: 1.1453 - recall: 0.4895 - val_loss: 1.1317 - val_recall: 0.5213
Epoch 35/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 112ms/step - loss: 1.1448 - recall: 0.4895 - val_loss: 1.1312 - val_recall: 0.5180
Epoch 36/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 1.1443 - recall: 0.4888 - val_loss: 1.1307 - val_recall: 0.5180
Epoch 37/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 1.1437 - recall: 0.4881 - val_loss: 1.1302 - val_recall: 0.5180
Epoch 38/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - loss: 1.1432 - recall: 0.4881 - val_loss: 1.1297 - val_recall: 0.5180
Epoch 39/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - loss: 1.1427 - recall: 0.4881 - val_loss: 1.1292 - val_recall: 0.5180
Epoch 40/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 1.1422 - recall: 0.4881 - val_loss: 1.1287 - val_recall: 0.5180
Epoch 41/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 1.1417 - recall: 0.4874 - val_loss: 1.1282 - val_recall: 0.5180
Epoch 42/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - loss: 1.1412 - recall: 0.4874 - val_loss: 1.1276 - val_recall: 0.5180
Epoch 43/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - loss: 1.1407 - recall: 0.4874 - val_loss: 1.1271 - val_recall: 0.5180
Epoch 44/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step - loss: 1.1401 - recall: 0.4874 - val_loss: 1.1266 - val_recall: 0.5180
Epoch 45/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 1.1396 - recall: 0.4874 - val_loss: 1.1261 - val_recall: 0.5180
Epoch 46/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step - loss: 1.1391 - recall: 0.4874 - val_loss: 1.1256 - val_recall: 0.5180
Epoch 47/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 1.1386 - recall: 0.4874 - val_loss: 1.1251 - val_recall: 0.5180
Epoch 48/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - loss: 1.1381 - recall: 0.4867 - val_loss: 1.1246 - val_recall: 0.5180
Epoch 49/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step - loss: 1.1376 - recall: 0.4867 - val_loss: 1.1241 - val_recall: 0.5180
Epoch 50/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - loss: 1.1371 - recall: 0.4867 - val_loss: 1.1236 - val_recall: 0.5180
In [2124]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  6.150696754455566
In [2125]:
model_nhl.evaluate(X_train,y_train)
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 1.1557 - recall: 0.4883
Out[2125]:
[1.1365574598312378, 0.48667600750923157]
In [2126]:
model_nhl.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 1.1442 - recall: 0.4897
Out[2126]:
[1.123594880104065, 0.5180327892303467]

Loss function

In [2127]:
# Plotting Train Loss vs Validation Loss
plot(history_nhl, 'loss')

Recall

In [2128]:
# Plotting Train recall vs Validation recall
plot(history_nhl, 'recall')
In [2129]:
# Predicting the results using best as a threshold
y_train_pred = model_nhl.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2129]:
array([[False],
       [ True],
       [ True],
       ...,
       [ True],
       [ True],
       [ True]])
In [2130]:
# Predicting the results using best as a threshold
y_val_pred = model_nhl.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 982us/step
Out[2130]:
array([[ True],
       [ True],
       [ True],
       ...,
       [False],
       [ True],
       [ True]])
In [2131]:
model_name = "NN with SGD with No Hidden Layers"

train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)
In [2132]:
results.loc[model_name]=['-','-','-',epochs,batch_size,'SGD',(end-start),history_nhl.history["loss"][-1],history_nhl.history["val_loss"][-1],history_nhl.history["recall"][-1],history_nhl.history["val_recall"][-1]]
In [2133]:
results
Out[2133]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033

Classification report

In [2134]:
# Classification report
cr = classification_report(y_train, y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.64      0.23      0.34      5574
         1.0       0.14      0.49      0.22      1426

    accuracy                           0.28      7000
   macro avg       0.39      0.36      0.28      7000
weighted avg       0.54      0.28      0.31      7000

In [2135]:
# Classification report
cr = classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.66      0.24      0.36      1195
         1.0       0.15      0.52      0.23       305

    accuracy                           0.30      1500
   macro avg       0.41      0.38      0.29      1500
weighted avg       0.56      0.30      0.33      1500

Confusion matrix

In [2136]:
make_confusion_matrix(y_train, y_train_pred)
In [2137]:
make_confusion_matrix(y_val, y_val_pred)
  • This Recall score is not very high. The losses seem to be high. Let's proceed to improve these.

Model Performance Improvement

Neural Network with SGD Optimizer

  • 1 Hidden Layer [14 neurons]
In [2138]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2139]:
# Initializing the neural network
model_0 = Sequential()

# Adding the input layer with neurons and relu as the activation function
model_0.add(Dense(14, activation='relu', input_dim = X_train.shape[1]))

# Adding the output layer with 1 neuron for binary output and using sigmoid for binary output
model_0.add(Dense(1, activation = 'sigmoid'))
In [2140]:
# Use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2141]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_0.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2142]:
model_0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 1)                   │              15 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 183 (732.00 B)
 Trainable params: 183 (732.00 B)
 Non-trainable params: 0 (0.00 B)
In [2143]:
epochs = 50
batch_size = X_train.shape[0]
In [2144]:
# Get start time
start = time.time()

# Fitting the ANN

history_0 = model_0.fit(
    X_train, y_train,
    batch_size=batch_size,
    validation_data=(X_val,y_val),
    epochs=epochs,
    verbose=1
)

# Get end time
end=time.time()
Epoch 1/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - loss: 0.8812 - recall: 0.8710 - val_loss: 0.8726 - val_recall: 0.8721
Epoch 2/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 130ms/step - loss: 0.8805 - recall: 0.8710 - val_loss: 0.8720 - val_recall: 0.8689
Epoch 3/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 135ms/step - loss: 0.8798 - recall: 0.8696 - val_loss: 0.8713 - val_recall: 0.8689
Epoch 4/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 145ms/step - loss: 0.8792 - recall: 0.8689 - val_loss: 0.8706 - val_recall: 0.8689
Epoch 5/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 134ms/step - loss: 0.8785 - recall: 0.8689 - val_loss: 0.8699 - val_recall: 0.8689
Epoch 6/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step - loss: 0.8778 - recall: 0.8689 - val_loss: 0.8693 - val_recall: 0.8689
Epoch 7/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 0.8771 - recall: 0.8682 - val_loss: 0.8686 - val_recall: 0.8656
Epoch 8/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - loss: 0.8764 - recall: 0.8675 - val_loss: 0.8679 - val_recall: 0.8656
Epoch 9/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step - loss: 0.8757 - recall: 0.8654 - val_loss: 0.8673 - val_recall: 0.8656
Epoch 10/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step - loss: 0.8750 - recall: 0.8647 - val_loss: 0.8666 - val_recall: 0.8656
Epoch 11/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step - loss: 0.8744 - recall: 0.8647 - val_loss: 0.8659 - val_recall: 0.8623
Epoch 12/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 146ms/step - loss: 0.8737 - recall: 0.8647 - val_loss: 0.8653 - val_recall: 0.8557
Epoch 13/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step - loss: 0.8730 - recall: 0.8633 - val_loss: 0.8646 - val_recall: 0.8525
Epoch 14/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step - loss: 0.8723 - recall: 0.8633 - val_loss: 0.8640 - val_recall: 0.8525
Epoch 15/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 128ms/step - loss: 0.8716 - recall: 0.8633 - val_loss: 0.8633 - val_recall: 0.8459
Epoch 16/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 76ms/step - loss: 0.8710 - recall: 0.8626 - val_loss: 0.8627 - val_recall: 0.8459
Epoch 17/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step - loss: 0.8703 - recall: 0.8619 - val_loss: 0.8620 - val_recall: 0.8459
Epoch 18/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - loss: 0.8696 - recall: 0.8619 - val_loss: 0.8614 - val_recall: 0.8459
Epoch 19/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8690 - recall: 0.8604 - val_loss: 0.8607 - val_recall: 0.8459
Epoch 20/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - loss: 0.8683 - recall: 0.8604 - val_loss: 0.8600 - val_recall: 0.8459
Epoch 21/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.8676 - recall: 0.8597 - val_loss: 0.8594 - val_recall: 0.8459
Epoch 22/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.8670 - recall: 0.8590 - val_loss: 0.8588 - val_recall: 0.8426
Epoch 23/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 86ms/step - loss: 0.8663 - recall: 0.8590 - val_loss: 0.8581 - val_recall: 0.8426
Epoch 24/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step - loss: 0.8656 - recall: 0.8583 - val_loss: 0.8575 - val_recall: 0.8426
Epoch 25/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 149ms/step - loss: 0.8650 - recall: 0.8576 - val_loss: 0.8568 - val_recall: 0.8426
Epoch 26/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 123ms/step - loss: 0.8643 - recall: 0.8576 - val_loss: 0.8562 - val_recall: 0.8426
Epoch 27/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step - loss: 0.8636 - recall: 0.8576 - val_loss: 0.8555 - val_recall: 0.8426
Epoch 28/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8630 - recall: 0.8569 - val_loss: 0.8549 - val_recall: 0.8426
Epoch 29/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - loss: 0.8623 - recall: 0.8562 - val_loss: 0.8543 - val_recall: 0.8426
Epoch 30/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - loss: 0.8617 - recall: 0.8555 - val_loss: 0.8536 - val_recall: 0.8393
Epoch 31/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 0.8610 - recall: 0.8555 - val_loss: 0.8530 - val_recall: 0.8328
Epoch 32/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step - loss: 0.8604 - recall: 0.8548 - val_loss: 0.8523 - val_recall: 0.8328
Epoch 33/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 0.8597 - recall: 0.8534 - val_loss: 0.8517 - val_recall: 0.8328
Epoch 34/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 137ms/step - loss: 0.8591 - recall: 0.8534 - val_loss: 0.8511 - val_recall: 0.8328
Epoch 35/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 117ms/step - loss: 0.8584 - recall: 0.8513 - val_loss: 0.8504 - val_recall: 0.8295
Epoch 36/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - loss: 0.8578 - recall: 0.8506 - val_loss: 0.8498 - val_recall: 0.8262
Epoch 37/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step - loss: 0.8571 - recall: 0.8499 - val_loss: 0.8492 - val_recall: 0.8262
Epoch 38/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - loss: 0.8565 - recall: 0.8499 - val_loss: 0.8486 - val_recall: 0.8262
Epoch 39/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - loss: 0.8558 - recall: 0.8499 - val_loss: 0.8479 - val_recall: 0.8262
Epoch 40/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - loss: 0.8552 - recall: 0.8492 - val_loss: 0.8473 - val_recall: 0.8230
Epoch 41/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - loss: 0.8546 - recall: 0.8492 - val_loss: 0.8467 - val_recall: 0.8230
Epoch 42/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 76ms/step - loss: 0.8539 - recall: 0.8485 - val_loss: 0.8461 - val_recall: 0.8230
Epoch 43/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 121ms/step - loss: 0.8533 - recall: 0.8478 - val_loss: 0.8454 - val_recall: 0.8230
Epoch 44/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - loss: 0.8526 - recall: 0.8464 - val_loss: 0.8448 - val_recall: 0.8197
Epoch 45/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step - loss: 0.8520 - recall: 0.8464 - val_loss: 0.8442 - val_recall: 0.8164
Epoch 46/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 122ms/step - loss: 0.8514 - recall: 0.8457 - val_loss: 0.8436 - val_recall: 0.8164
Epoch 47/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - loss: 0.8507 - recall: 0.8443 - val_loss: 0.8430 - val_recall: 0.8164
Epoch 48/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8501 - recall: 0.8429 - val_loss: 0.8423 - val_recall: 0.8164
Epoch 49/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8495 - recall: 0.8422 - val_loss: 0.8417 - val_recall: 0.8131
Epoch 50/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step - loss: 0.8488 - recall: 0.8408 - val_loss: 0.8411 - val_recall: 0.8098
In [2145]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  5.692543983459473
In [2146]:
model_0.evaluate(X_train,y_train)
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.8521 - recall: 0.8410
Out[2146]:
[0.8482104539871216, 0.8401122093200684]
In [2147]:
model_0.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.8509 - recall: 0.7971
Out[2147]:
[0.8411220908164978, 0.8098360896110535]

Loss function

In [2148]:
# Plotting Train Loss vs Validation Loss
plot(history_0, 'loss')

Recall

In [2149]:
# Plotting Train recall vs Validation recall
plot(history_0, 'recall')
In [2150]:
# Predicting the results using best as a threshold
y_train_pred = model_0.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2150]:
array([[ True],
       [ True],
       [ True],
       ...,
       [ True],
       [ True],
       [ True]])
In [2151]:
# Predicting the results using best as a threshold
y_val_pred = model_0.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2151]:
array([[ True],
       [ True],
       [ True],
       ...,
       [ True],
       [ True],
       [False]])
In [2152]:
model_name = "NN with SGD - 1 Hidden Layer"

train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)
In [2153]:
results.loc[model_name]=['1','14','relu',epochs,batch_size,'SGD',(end-start),history_0.history["loss"][-1],history_0.history["val_loss"][-1],history_0.history["recall"][-1],history_0.history["val_recall"][-1]]
In [2154]:
results
Out[2154]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836

Classification report

In [2155]:
# Classification report
cr = classification_report(y_train, y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.76      0.13      0.22      5574
         1.0       0.20      0.84      0.32      1426

    accuracy                           0.28      7000
   macro avg       0.48      0.49      0.27      7000
weighted avg       0.65      0.28      0.24      7000

In [2156]:
# Classification report
cr = classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.73      0.13      0.23      1195
         1.0       0.19      0.81      0.31       305

    accuracy                           0.27      1500
   macro avg       0.46      0.47      0.27      1500
weighted avg       0.62      0.27      0.24      1500

Confusion matrix

In [2157]:
make_confusion_matrix(y_train, y_train_pred)
In [2158]:
make_confusion_matrix(y_val, y_val_pred)
  • This Recall score is not bad at all and the loss improved.
  • The addition of one hidden layer to the baseline model (no hidden layers), brought about a dramatic improvement in the Recall score. The loss also reduced quite a bit.
  • I noticed that the Recall goes down, though, as the epochs increase. Let's try to improve on this.

Neural Network with SGD Optimizer

  • 2 Hidden Layers [14,7 neurons]
In [2159]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2160]:
# Initializing the neural network
model_01 = Sequential()

# Adding the input layer with neurons and relu as the activation function
model_01.add(Dense(14, activation='relu', input_dim = X_train.shape[1]))

# Adding one hidden layer with neurons and Relu as the activation function
model_01.add(Dense(7, activation='relu'))

# Adding the output layer with 1 neuron for binary output and using sigmoid for binary output
model_01.add(Dense(1, activation = 'sigmoid'))
In [2161]:
# Use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2162]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_01.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2163]:
model_01.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 281 (1.10 KB)
 Trainable params: 281 (1.10 KB)
 Non-trainable params: 0 (0.00 B)
In [2164]:
epochs = 50
batch_size = X_train.shape[0]
In [2165]:
# Get start time
start = time.time()

# Fitting the ANN

history_01 = model_01.fit(
    X_train, y_train,
    batch_size=batch_size,
    validation_data=(X_val,y_val),
    epochs=epochs,
    verbose=1
)

# Get end time
end=time.time()
Epoch 1/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 640ms/step - loss: 0.8526 - recall: 0.9607 - val_loss: 0.8546 - val_recall: 0.9803
Epoch 2/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 78ms/step - loss: 0.8521 - recall: 0.9607 - val_loss: 0.8541 - val_recall: 0.9803
Epoch 3/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 118ms/step - loss: 0.8516 - recall: 0.9607 - val_loss: 0.8536 - val_recall: 0.9803
Epoch 4/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - loss: 0.8511 - recall: 0.9600 - val_loss: 0.8531 - val_recall: 0.9803
Epoch 5/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step - loss: 0.8506 - recall: 0.9600 - val_loss: 0.8526 - val_recall: 0.9803
Epoch 6/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 128ms/step - loss: 0.8501 - recall: 0.9600 - val_loss: 0.8521 - val_recall: 0.9803
Epoch 7/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - loss: 0.8496 - recall: 0.9593 - val_loss: 0.8516 - val_recall: 0.9803
Epoch 8/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - loss: 0.8491 - recall: 0.9586 - val_loss: 0.8511 - val_recall: 0.9803
Epoch 9/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8486 - recall: 0.9579 - val_loss: 0.8506 - val_recall: 0.9803
Epoch 10/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 55ms/step - loss: 0.8481 - recall: 0.9572 - val_loss: 0.8501 - val_recall: 0.9770
Epoch 11/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step - loss: 0.8476 - recall: 0.9565 - val_loss: 0.8496 - val_recall: 0.9770
Epoch 12/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 128ms/step - loss: 0.8471 - recall: 0.9558 - val_loss: 0.8491 - val_recall: 0.9770
Epoch 13/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 149ms/step - loss: 0.8466 - recall: 0.9551 - val_loss: 0.8486 - val_recall: 0.9770
Epoch 14/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - loss: 0.8461 - recall: 0.9551 - val_loss: 0.8481 - val_recall: 0.9770
Epoch 15/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 159ms/step - loss: 0.8456 - recall: 0.9551 - val_loss: 0.8476 - val_recall: 0.9770
Epoch 16/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 133ms/step - loss: 0.8451 - recall: 0.9544 - val_loss: 0.8471 - val_recall: 0.9770
Epoch 17/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 170ms/step - loss: 0.8446 - recall: 0.9537 - val_loss: 0.8466 - val_recall: 0.9770
Epoch 18/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 108ms/step - loss: 0.8441 - recall: 0.9530 - val_loss: 0.8461 - val_recall: 0.9770
Epoch 19/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 124ms/step - loss: 0.8436 - recall: 0.9523 - val_loss: 0.8456 - val_recall: 0.9770
Epoch 20/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 93ms/step - loss: 0.8431 - recall: 0.9523 - val_loss: 0.8451 - val_recall: 0.9770
Epoch 21/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 0.8426 - recall: 0.9523 - val_loss: 0.8446 - val_recall: 0.9770
Epoch 22/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 136ms/step - loss: 0.8421 - recall: 0.9523 - val_loss: 0.8442 - val_recall: 0.9770
Epoch 23/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 136ms/step - loss: 0.8417 - recall: 0.9523 - val_loss: 0.8437 - val_recall: 0.9738
Epoch 24/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 89ms/step - loss: 0.8412 - recall: 0.9523 - val_loss: 0.8432 - val_recall: 0.9738
Epoch 25/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - loss: 0.8407 - recall: 0.9516 - val_loss: 0.8427 - val_recall: 0.9738
Epoch 26/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 145ms/step - loss: 0.8402 - recall: 0.9516 - val_loss: 0.8422 - val_recall: 0.9738
Epoch 27/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 150ms/step - loss: 0.8397 - recall: 0.9509 - val_loss: 0.8417 - val_recall: 0.9738
Epoch 28/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - loss: 0.8392 - recall: 0.9509 - val_loss: 0.8413 - val_recall: 0.9738
Epoch 29/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step - loss: 0.8388 - recall: 0.9502 - val_loss: 0.8408 - val_recall: 0.9705
Epoch 30/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - loss: 0.8383 - recall: 0.9495 - val_loss: 0.8403 - val_recall: 0.9705
Epoch 31/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 141ms/step - loss: 0.8378 - recall: 0.9495 - val_loss: 0.8398 - val_recall: 0.9705
Epoch 32/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 133ms/step - loss: 0.8373 - recall: 0.9495 - val_loss: 0.8393 - val_recall: 0.9705
Epoch 33/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 135ms/step - loss: 0.8368 - recall: 0.9495 - val_loss: 0.8389 - val_recall: 0.9705
Epoch 34/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 147ms/step - loss: 0.8364 - recall: 0.9495 - val_loss: 0.8384 - val_recall: 0.9672
Epoch 35/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 116ms/step - loss: 0.8359 - recall: 0.9481 - val_loss: 0.8379 - val_recall: 0.9672
Epoch 36/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - loss: 0.8354 - recall: 0.9467 - val_loss: 0.8374 - val_recall: 0.9672
Epoch 37/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 148ms/step - loss: 0.8349 - recall: 0.9467 - val_loss: 0.8370 - val_recall: 0.9672
Epoch 38/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - loss: 0.8345 - recall: 0.9467 - val_loss: 0.8365 - val_recall: 0.9672
Epoch 39/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step - loss: 0.8340 - recall: 0.9467 - val_loss: 0.8360 - val_recall: 0.9672
Epoch 40/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 96ms/step - loss: 0.8335 - recall: 0.9467 - val_loss: 0.8356 - val_recall: 0.9672
Epoch 41/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 181ms/step - loss: 0.8330 - recall: 0.9467 - val_loss: 0.8351 - val_recall: 0.9639
Epoch 42/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 264ms/step - loss: 0.8326 - recall: 0.9467 - val_loss: 0.8346 - val_recall: 0.9639
Epoch 43/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 123ms/step - loss: 0.8321 - recall: 0.9467 - val_loss: 0.8342 - val_recall: 0.9639
Epoch 44/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step - loss: 0.8316 - recall: 0.9467 - val_loss: 0.8337 - val_recall: 0.9639
Epoch 45/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - loss: 0.8312 - recall: 0.9467 - val_loss: 0.8332 - val_recall: 0.9639
Epoch 46/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - loss: 0.8307 - recall: 0.9467 - val_loss: 0.8328 - val_recall: 0.9639
Epoch 47/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step - loss: 0.8302 - recall: 0.9467 - val_loss: 0.8323 - val_recall: 0.9639
Epoch 48/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step - loss: 0.8298 - recall: 0.9460 - val_loss: 0.8318 - val_recall: 0.9639
Epoch 49/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 135ms/step - loss: 0.8293 - recall: 0.9453 - val_loss: 0.8314 - val_recall: 0.9639
Epoch 50/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 80ms/step - loss: 0.8288 - recall: 0.9453 - val_loss: 0.8309 - val_recall: 0.9639
In [2166]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  6.878960609436035
In [2167]:
model_01.evaluate(X_train,y_train)
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.8310 - recall: 0.9527
Out[2167]:
[0.8283853530883789, 0.9453015327453613]
In [2168]:
model_01.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.8359 - recall: 0.9775
Out[2168]:
[0.830915093421936, 0.9639344215393066]

Loss function

In [2169]:
# Plotting Train Loss vs Validation Loss
plot(history_01, 'loss')

Recall

In [2170]:
# Plotting Train recall vs Validation recall
plot(history_01, 'recall')
In [2171]:
# Predicting the results using best as a threshold
y_train_pred = model_01.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2171]:
array([[ True],
       [False],
       [ True],
       ...,
       [ True],
       [ True],
       [ True]])
In [2172]:
# Predicting the results using best as a threshold
y_val_pred = model_01.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2172]:
array([[ True],
       [ True],
       [ True],
       ...,
       [ True],
       [False],
       [False]])
In [2173]:
model_name = "NN with SGD - 2 Hidden Layers [14,7]"

train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)
In [2174]:
results.loc[model_name]=['2','14,7','relu,relu',epochs,batch_size,'SGD',(end-start),history_01.history["loss"][-1],history_01.history["val_loss"][-1],history_01.history["recall"][-1],history_01.history["val_recall"][-1]]
In [2175]:
results
Out[2175]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934

Classification report

In [2176]:
# Classification report
cr = classification_report(y_train, y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.90      0.12      0.22      5574
         1.0       0.22      0.95      0.35      1426

    accuracy                           0.29      7000
   macro avg       0.56      0.53      0.28      7000
weighted avg       0.76      0.29      0.24      7000

In [2177]:
# Classification report
cr = classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.93      0.12      0.21      1195
         1.0       0.22      0.96      0.36       305

    accuracy                           0.29      1500
   macro avg       0.57      0.54      0.28      1500
weighted avg       0.78      0.29      0.24      1500

Confusion matrix

In [2178]:
make_confusion_matrix(y_train, y_train_pred)
In [2179]:
make_confusion_matrix(y_val, y_val_pred)
  • This Recall score is very much improved and the loss improved also.
  • The addition of a secong hidden layer to the baseline model resulted in even more performance improvement as both the Recall score increased and the loss reduced further.
  • I noticed that the Recall still goes down, though, as the epochs increase. Let's try to improve on this.

Neural Network with SGD Optimizer

  • 2 Hidden Layers [64,32 neurons]
In [2180]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2181]:
# Initializing the neural network
model_02 = Sequential()

# Adding the input layer with neurons and relu as the activation function
model_02.add(Dense(64, activation='relu', input_dim = X_train.shape[1]))

# Adding one hidden layer with neurons and Relu as the activation function
model_02.add(Dense(32, activation='relu'))

# Adding the output layer with 1 neuron for binary output and using sigmoid for binary output
model_02.add(Dense(1, activation = 'sigmoid'))
In [2182]:
# Use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2183]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_02.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2184]:
model_02.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,881 (11.25 KB)
 Trainable params: 2,881 (11.25 KB)
 Non-trainable params: 0 (0.00 B)
In [2185]:
epochs = 50
batch_size = X_train.shape[0]
In [2186]:
# Get start time
start = time.time()

# Fitting the ANN

history_02 = model_02.fit(
    X_train, y_train,
    batch_size=batch_size,
    validation_data=(X_val,y_val),
    epochs=epochs,
    verbose=1
)

# Get end time
end=time.time()
Epoch 1/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 590ms/step - loss: 0.7020 - recall: 0.4327 - val_loss: 0.7008 - val_recall: 0.4590
Epoch 2/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 88ms/step - loss: 0.7014 - recall: 0.4285 - val_loss: 0.7002 - val_recall: 0.4525
Epoch 3/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 80ms/step - loss: 0.7007 - recall: 0.4257 - val_loss: 0.6996 - val_recall: 0.4492
Epoch 4/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 124ms/step - loss: 0.7001 - recall: 0.4236 - val_loss: 0.6990 - val_recall: 0.4393
Epoch 5/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - loss: 0.6995 - recall: 0.4222 - val_loss: 0.6984 - val_recall: 0.4328
Epoch 6/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 129ms/step - loss: 0.6990 - recall: 0.4201 - val_loss: 0.6978 - val_recall: 0.4295
Epoch 7/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 76ms/step - loss: 0.6984 - recall: 0.4165 - val_loss: 0.6972 - val_recall: 0.4295
Epoch 8/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 139ms/step - loss: 0.6978 - recall: 0.4144 - val_loss: 0.6966 - val_recall: 0.4230
Epoch 9/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.6972 - recall: 0.4116 - val_loss: 0.6960 - val_recall: 0.4164
Epoch 10/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - loss: 0.6966 - recall: 0.4109 - val_loss: 0.6954 - val_recall: 0.4164
Epoch 11/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step - loss: 0.6960 - recall: 0.4088 - val_loss: 0.6948 - val_recall: 0.4131
Epoch 12/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step - loss: 0.6954 - recall: 0.4060 - val_loss: 0.6942 - val_recall: 0.4131
Epoch 13/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step - loss: 0.6948 - recall: 0.4039 - val_loss: 0.6937 - val_recall: 0.4098
Epoch 14/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 143ms/step - loss: 0.6942 - recall: 0.4018 - val_loss: 0.6931 - val_recall: 0.4098
Epoch 15/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.6937 - recall: 0.3990 - val_loss: 0.6925 - val_recall: 0.4098
Epoch 16/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.6931 - recall: 0.3976 - val_loss: 0.6919 - val_recall: 0.4033
Epoch 17/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step - loss: 0.6925 - recall: 0.3934 - val_loss: 0.6914 - val_recall: 0.4033
Epoch 18/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 79ms/step - loss: 0.6919 - recall: 0.3899 - val_loss: 0.6908 - val_recall: 0.4033
Epoch 19/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - loss: 0.6914 - recall: 0.3878 - val_loss: 0.6902 - val_recall: 0.4033
Epoch 20/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - loss: 0.6908 - recall: 0.3850 - val_loss: 0.6896 - val_recall: 0.4000
Epoch 21/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - loss: 0.6902 - recall: 0.3836 - val_loss: 0.6891 - val_recall: 0.4000
Epoch 22/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - loss: 0.6897 - recall: 0.3822 - val_loss: 0.6885 - val_recall: 0.3967
Epoch 23/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 133ms/step - loss: 0.6891 - recall: 0.3780 - val_loss: 0.6879 - val_recall: 0.3934
Epoch 24/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 88ms/step - loss: 0.6885 - recall: 0.3759 - val_loss: 0.6874 - val_recall: 0.3934
Epoch 25/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 104ms/step - loss: 0.6880 - recall: 0.3724 - val_loss: 0.6868 - val_recall: 0.3902
Epoch 26/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 103ms/step - loss: 0.6874 - recall: 0.3717 - val_loss: 0.6863 - val_recall: 0.3902
Epoch 27/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 156ms/step - loss: 0.6868 - recall: 0.3689 - val_loss: 0.6857 - val_recall: 0.3803
Epoch 28/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 115ms/step - loss: 0.6863 - recall: 0.3661 - val_loss: 0.6852 - val_recall: 0.3770
Epoch 29/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 136ms/step - loss: 0.6857 - recall: 0.3640 - val_loss: 0.6846 - val_recall: 0.3705
Epoch 30/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step - loss: 0.6852 - recall: 0.3612 - val_loss: 0.6840 - val_recall: 0.3639
Epoch 31/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - loss: 0.6846 - recall: 0.3576 - val_loss: 0.6835 - val_recall: 0.3574
Epoch 32/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 78ms/step - loss: 0.6841 - recall: 0.3541 - val_loss: 0.6830 - val_recall: 0.3574
Epoch 33/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 136ms/step - loss: 0.6835 - recall: 0.3527 - val_loss: 0.6824 - val_recall: 0.3541
Epoch 34/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 122ms/step - loss: 0.6830 - recall: 0.3506 - val_loss: 0.6819 - val_recall: 0.3508
Epoch 35/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step - loss: 0.6824 - recall: 0.3499 - val_loss: 0.6813 - val_recall: 0.3475
Epoch 36/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 82ms/step - loss: 0.6819 - recall: 0.3478 - val_loss: 0.6808 - val_recall: 0.3410
Epoch 37/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step - loss: 0.6814 - recall: 0.3457 - val_loss: 0.6802 - val_recall: 0.3344
Epoch 38/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - loss: 0.6808 - recall: 0.3429 - val_loss: 0.6797 - val_recall: 0.3279
Epoch 39/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - loss: 0.6803 - recall: 0.3422 - val_loss: 0.6792 - val_recall: 0.3279
Epoch 40/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 68ms/step - loss: 0.6798 - recall: 0.3408 - val_loss: 0.6786 - val_recall: 0.3246
Epoch 41/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 151ms/step - loss: 0.6792 - recall: 0.3401 - val_loss: 0.6781 - val_recall: 0.3213
Epoch 42/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - loss: 0.6787 - recall: 0.3366 - val_loss: 0.6776 - val_recall: 0.3213
Epoch 43/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 89ms/step - loss: 0.6782 - recall: 0.3338 - val_loss: 0.6770 - val_recall: 0.3180
Epoch 44/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 91ms/step - loss: 0.6776 - recall: 0.3310 - val_loss: 0.6765 - val_recall: 0.3180
Epoch 45/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 117ms/step - loss: 0.6771 - recall: 0.3275 - val_loss: 0.6760 - val_recall: 0.3115
Epoch 46/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step - loss: 0.6766 - recall: 0.3254 - val_loss: 0.6755 - val_recall: 0.3115
Epoch 47/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 81ms/step - loss: 0.6761 - recall: 0.3219 - val_loss: 0.6749 - val_recall: 0.3115
Epoch 48/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 118ms/step - loss: 0.6755 - recall: 0.3205 - val_loss: 0.6744 - val_recall: 0.3115
Epoch 49/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - loss: 0.6750 - recall: 0.3170 - val_loss: 0.6739 - val_recall: 0.3115
Epoch 50/50
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 141ms/step - loss: 0.6745 - recall: 0.3142 - val_loss: 0.6734 - val_recall: 0.3115
In [2187]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  5.44616436958313
In [2188]:
model_02.evaluate(X_train,y_train)
219/219 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.6747 - recall: 0.3219
Out[2188]:
[0.6739842891693115, 0.3106591999530792]
In [2189]:
model_02.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6719 - recall: 0.3088
Out[2189]:
[0.6733905673027039, 0.31147539615631104]

Loss function

In [2190]:
# Plotting Train Loss vs Validation Loss
plot(history_02, 'loss')

Recall

In [2191]:
# Plotting Train recall vs Validation recall
plot(history_02, 'recall')
In [2192]:
# Predicting the results using best as a threshold
y_train_pred = model_02.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[2192]:
array([[False],
       [ True],
       [False],
       ...,
       [False],
       [False],
       [ True]])
In [2193]:
# Predicting the results using best as a threshold
y_val_pred = model_02.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2193]:
array([[ True],
       [ True],
       [ True],
       ...,
       [False],
       [False],
       [False]])
In [2194]:
model_name = "NN with SGD - 2 Hidden Layers [64,32]"

train_metric_df.loc[model_name] = recall_score(y_train, y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val, y_val_pred)
In [2195]:
results.loc[model_name]=['2','64,32','relu,relu',epochs,batch_size,'SGD',(end-start),history_02.history["loss"][-1],history_02.history["val_loss"][-1],history_02.history["recall"][-1],history_02.history["val_recall"][-1]]
In [2196]:
results
Out[2196]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475

Classification report

In [2197]:
# Classification report
cr = classification_report(y_train, y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.78      0.64      0.71      5574
         1.0       0.18      0.31      0.23      1426

    accuracy                           0.57      7000
   macro avg       0.48      0.48      0.47      7000
weighted avg       0.66      0.57      0.61      7000

In [2198]:
# Classification report
cr = classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.79      0.64      0.71      1195
         1.0       0.18      0.31      0.23       305

    accuracy                           0.57      1500
   macro avg       0.48      0.48      0.47      1500
weighted avg       0.66      0.57      0.61      1500

Confusion matrix

In [2199]:
make_confusion_matrix(y_train, y_train_pred)
In [2200]:
make_confusion_matrix(y_val, y_val_pred)
  • Increasing the number of neurons reduced the Recall drastically. It also reduced the loss. It pretty much drained the life out of the base model.

Neural Network with Adam Optimizer

  • 2 Hidden Layers [64,32 neurons]
In [2201]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2202]:
# Initializing the neural network
model_1 = Sequential()
model_1.add(Dense(64,activation='relu',input_dim = X_train.shape[1]))
model_1.add(Dense(32,activation='relu'))
model_1.add(Dense(1, activation = 'sigmoid'))
In [2203]:
# Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
#metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2204]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_1.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2205]:
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,881 (11.25 KB)
 Trainable params: 2,881 (11.25 KB)
 Non-trainable params: 0 (0.00 B)
In [2206]:
epochs = 50
batch_size = 128
In [2207]:
# Get start time
start = time.time()

# Fitting the ANN
history_1 = model_1.fit(
    X_train,y_train,
    batch_size=batch_size,
    validation_data=(X_val,y_val),
    epochs=epochs,
    verbose=1
)

# Get end time
end=time.time()
Epoch 1/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.5659 - recall: 0.1052 - val_loss: 0.4403 - val_recall: 0.0426
Epoch 2/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4317 - recall: 0.1288 - val_loss: 0.4094 - val_recall: 0.2525
Epoch 3/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4039 - recall: 0.2840 - val_loss: 0.3859 - val_recall: 0.3639
Epoch 4/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3843 - recall: 0.3718 - val_loss: 0.3699 - val_recall: 0.4361
Epoch 5/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3720 - recall: 0.4214 - val_loss: 0.3607 - val_recall: 0.4590
Epoch 6/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3647 - recall: 0.4368 - val_loss: 0.3550 - val_recall: 0.4754
Epoch 7/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3600 - recall: 0.4459 - val_loss: 0.3516 - val_recall: 0.4820
Epoch 8/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3565 - recall: 0.4499 - val_loss: 0.3492 - val_recall: 0.4984
Epoch 9/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3538 - recall: 0.4535 - val_loss: 0.3473 - val_recall: 0.4885
Epoch 10/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3515 - recall: 0.4601 - val_loss: 0.3458 - val_recall: 0.4951
Epoch 11/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3495 - recall: 0.4613 - val_loss: 0.3446 - val_recall: 0.4984
Epoch 12/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3477 - recall: 0.4648 - val_loss: 0.3435 - val_recall: 0.5016
Epoch 13/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3461 - recall: 0.4695 - val_loss: 0.3425 - val_recall: 0.4984
Epoch 14/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3445 - recall: 0.4696 - val_loss: 0.3416 - val_recall: 0.4951
Epoch 15/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3431 - recall: 0.4744 - val_loss: 0.3410 - val_recall: 0.4984
Epoch 16/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3417 - recall: 0.4757 - val_loss: 0.3405 - val_recall: 0.4951
Epoch 17/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3406 - recall: 0.4790 - val_loss: 0.3401 - val_recall: 0.4984
Epoch 18/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3393 - recall: 0.4781 - val_loss: 0.3397 - val_recall: 0.4951
Epoch 19/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3380 - recall: 0.4771 - val_loss: 0.3394 - val_recall: 0.4951
Epoch 20/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3369 - recall: 0.4813 - val_loss: 0.3392 - val_recall: 0.4918
Epoch 21/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3357 - recall: 0.4823 - val_loss: 0.3390 - val_recall: 0.4951
Epoch 22/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3346 - recall: 0.4841 - val_loss: 0.3389 - val_recall: 0.4918
Epoch 23/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 0.3336 - recall: 0.4861 - val_loss: 0.3386 - val_recall: 0.4918
Epoch 24/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3325 - recall: 0.4895 - val_loss: 0.3384 - val_recall: 0.4918
Epoch 25/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3316 - recall: 0.4869 - val_loss: 0.3384 - val_recall: 0.4951
Epoch 26/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3306 - recall: 0.4903 - val_loss: 0.3384 - val_recall: 0.4951
Epoch 27/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3297 - recall: 0.4889 - val_loss: 0.3384 - val_recall: 0.4951
Epoch 28/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 0.3287 - recall: 0.4864 - val_loss: 0.3385 - val_recall: 0.4984
Epoch 29/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3277 - recall: 0.4875 - val_loss: 0.3385 - val_recall: 0.4984
Epoch 30/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 0.3268 - recall: 0.4876 - val_loss: 0.3385 - val_recall: 0.4984
Epoch 31/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3258 - recall: 0.4885 - val_loss: 0.3384 - val_recall: 0.5016
Epoch 32/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3248 - recall: 0.4916 - val_loss: 0.3384 - val_recall: 0.5049
Epoch 33/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3238 - recall: 0.4919 - val_loss: 0.3386 - val_recall: 0.5049
Epoch 34/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3229 - recall: 0.4943 - val_loss: 0.3387 - val_recall: 0.5049
Epoch 35/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3220 - recall: 0.4955 - val_loss: 0.3389 - val_recall: 0.5049
Epoch 36/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3212 - recall: 0.4965 - val_loss: 0.3390 - val_recall: 0.5049
Epoch 37/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 0.3203 - recall: 0.4980 - val_loss: 0.3392 - val_recall: 0.5049
Epoch 38/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3196 - recall: 0.5056 - val_loss: 0.3393 - val_recall: 0.5049
Epoch 39/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3188 - recall: 0.5074 - val_loss: 0.3395 - val_recall: 0.5115
Epoch 40/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3181 - recall: 0.5068 - val_loss: 0.3397 - val_recall: 0.5115
Epoch 41/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3173 - recall: 0.5075 - val_loss: 0.3398 - val_recall: 0.5115
Epoch 42/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3166 - recall: 0.5076 - val_loss: 0.3398 - val_recall: 0.5082
Epoch 43/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3158 - recall: 0.5120 - val_loss: 0.3400 - val_recall: 0.5082
Epoch 44/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3152 - recall: 0.5139 - val_loss: 0.3401 - val_recall: 0.5049
Epoch 45/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3144 - recall: 0.5158 - val_loss: 0.3403 - val_recall: 0.5049
Epoch 46/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3136 - recall: 0.5143 - val_loss: 0.3405 - val_recall: 0.5049
Epoch 47/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3129 - recall: 0.5135 - val_loss: 0.3407 - val_recall: 0.5049
Epoch 48/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3122 - recall: 0.5172 - val_loss: 0.3409 - val_recall: 0.5049
Epoch 49/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3115 - recall: 0.5174 - val_loss: 0.3411 - val_recall: 0.5049
Epoch 50/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3107 - recall: 0.5178 - val_loss: 0.3412 - val_recall: 0.5016
In [2208]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  16.75520086288452
In [2209]:
model_1.evaluate(X_train,y_train)
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.2952 - recall: 0.5110
Out[2209]:
[0.29657599329948425, 0.5308555364608765]
In [2210]:
model_1.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.3401 - recall: 0.4880
Out[2210]:
[0.3412121534347534, 0.5016393661499023]

Loss function

In [2211]:
# Plotting Train Loss vs Validation Loss
plot(history_1, 'loss')

Recall

In [2212]:
# Plotting Train recall vs Validation recall
plot(history_1, 'recall')
In [2213]:
#Predicting the results using 0.5 as the threshold
y_train_pred = model_1.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2213]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [2214]:
#Predicting the results using 0.5 as the threshold
y_val_pred = model_1.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2214]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [2215]:
model_name = "NN with Adam - 2 Hidden Layers [64,32]"

train_metric_df.loc[model_name] = recall_score(y_train,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2216]:
results.loc[model_name]=['2','64,32','relu,relu',epochs,batch_size,'Adam',(end-start),history_1.history["loss"][-1],history_1.history["val_loss"][-1],history_1.history["recall"][-1],history_1.history["val_recall"][-1]]
In [2217]:
results
Out[2217]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639

Classification report

In [2218]:
#lassification report
cr=classification_report(y_train,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.89      0.96      0.93      5574
         1.0       0.79      0.53      0.64      1426

    accuracy                           0.88      7000
   macro avg       0.84      0.75      0.78      7000
weighted avg       0.87      0.88      0.87      7000

In [2219]:
#classification report
cr=classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.88      0.95      0.92      1195
         1.0       0.74      0.50      0.60       305

    accuracy                           0.86      1500
   macro avg       0.81      0.73      0.76      1500
weighted avg       0.85      0.86      0.85      1500

Confusion matrix

In [2220]:
#Calculating the confusion matrix
make_confusion_matrix(y_train, y_train_pred)
In [2221]:
#Calculating the confusion matrix
make_confusion_matrix(y_val, y_val_pred)
  • The Recall on this model improved over the last model and the loss reduced further. Nothing significant versus the last model.
In [2222]:
results
Out[2222]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639

Neural Network with Adam Optimizer and Dropout

  • 2 Hidden Layers [64,32 neurons]
In [2328]:
# Clear up the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2329]:
# Initializing the neural network
model_2 = Sequential()

# Add the input layer with neurons and relu as activation function
model_2.add(Dense(64,activation='relu',input_dim = X_train.shape[1]))

# Add dropout with ratio of 0.2 or any suitable value.
model_2.add(Dropout(0.2))

# Add a hidden layer
model_2.add(Dense(32,activation='tanh'))

# Add dropout with ratio of 0.1 or any suitable value.
model_2.add(Dropout(0.2))

# Add the number of neurons required in the output layer.
model_2.add(Dense(1, activation = 'sigmoid'))
In [2330]:
# Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2331]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_2.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2332]:
# Summary of the model
model_2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 64)                  │             768 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 64)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           2,080 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,881 (11.25 KB)
 Trainable params: 2,881 (11.25 KB)
 Non-trainable params: 0 (0.00 B)
In [2333]:
epochs = 50
batch_size = 128
In [2334]:
# Get start time
start = time.time()

# Fitting the ANN with batch_size = 32 and 100 epochs
history_2 = model_2.fit(
    X_train,y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data=(X_val,y_val)
)

# Get end time
end=time.time()
Epoch 1/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.6177 - recall: 0.2420 - val_loss: 0.4438 - val_recall: 0.0459
Epoch 2/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4509 - recall: 0.1207 - val_loss: 0.4236 - val_recall: 0.1443
Epoch 3/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.4383 - recall: 0.2005 - val_loss: 0.4096 - val_recall: 0.2393
Epoch 4/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4232 - recall: 0.2682 - val_loss: 0.3963 - val_recall: 0.3213
Epoch 5/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4087 - recall: 0.3059 - val_loss: 0.3849 - val_recall: 0.3443
Epoch 6/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3971 - recall: 0.3373 - val_loss: 0.3741 - val_recall: 0.3934
Epoch 7/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3875 - recall: 0.3746 - val_loss: 0.3660 - val_recall: 0.4131
Epoch 8/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3856 - recall: 0.3930 - val_loss: 0.3594 - val_recall: 0.4262
Epoch 9/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3808 - recall: 0.4082 - val_loss: 0.3544 - val_recall: 0.4492
Epoch 10/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3818 - recall: 0.4214 - val_loss: 0.3524 - val_recall: 0.4656
Epoch 11/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3746 - recall: 0.4209 - val_loss: 0.3501 - val_recall: 0.4623
Epoch 12/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3739 - recall: 0.4225 - val_loss: 0.3474 - val_recall: 0.4656
Epoch 13/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3739 - recall: 0.4253 - val_loss: 0.3466 - val_recall: 0.4623
Epoch 14/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3709 - recall: 0.4107 - val_loss: 0.3450 - val_recall: 0.4721
Epoch 15/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3650 - recall: 0.4412 - val_loss: 0.3436 - val_recall: 0.4820
Epoch 16/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3655 - recall: 0.4298 - val_loss: 0.3433 - val_recall: 0.4721
Epoch 17/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3655 - recall: 0.4415 - val_loss: 0.3431 - val_recall: 0.4754
Epoch 18/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3745 - recall: 0.4330 - val_loss: 0.3426 - val_recall: 0.4689
Epoch 19/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3609 - recall: 0.4351 - val_loss: 0.3421 - val_recall: 0.4787
Epoch 20/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3639 - recall: 0.4468 - val_loss: 0.3411 - val_recall: 0.4820
Epoch 21/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3581 - recall: 0.4517 - val_loss: 0.3402 - val_recall: 0.4754
Epoch 22/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3598 - recall: 0.4386 - val_loss: 0.3383 - val_recall: 0.4787
Epoch 23/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3629 - recall: 0.4275 - val_loss: 0.3388 - val_recall: 0.4787
Epoch 24/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3601 - recall: 0.4521 - val_loss: 0.3378 - val_recall: 0.4951
Epoch 25/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3613 - recall: 0.4600 - val_loss: 0.3370 - val_recall: 0.4885
Epoch 26/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3587 - recall: 0.4543 - val_loss: 0.3370 - val_recall: 0.4918
Epoch 27/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3548 - recall: 0.4555 - val_loss: 0.3374 - val_recall: 0.4984
Epoch 28/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.3554 - recall: 0.4639 - val_loss: 0.3361 - val_recall: 0.4984
Epoch 29/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3565 - recall: 0.4495 - val_loss: 0.3363 - val_recall: 0.4787
Epoch 30/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3564 - recall: 0.4514 - val_loss: 0.3363 - val_recall: 0.4754
Epoch 31/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3493 - recall: 0.4679 - val_loss: 0.3350 - val_recall: 0.4951
Epoch 32/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3587 - recall: 0.4388 - val_loss: 0.3349 - val_recall: 0.4787
Epoch 33/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3565 - recall: 0.4647 - val_loss: 0.3345 - val_recall: 0.4918
Epoch 34/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3555 - recall: 0.4516 - val_loss: 0.3345 - val_recall: 0.4984
Epoch 35/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3531 - recall: 0.4603 - val_loss: 0.3344 - val_recall: 0.4918
Epoch 36/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3541 - recall: 0.4379 - val_loss: 0.3345 - val_recall: 0.4852
Epoch 37/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3592 - recall: 0.4543 - val_loss: 0.3339 - val_recall: 0.4918
Epoch 38/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3519 - recall: 0.4456 - val_loss: 0.3341 - val_recall: 0.5049
Epoch 39/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3493 - recall: 0.4584 - val_loss: 0.3335 - val_recall: 0.4984
Epoch 40/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3513 - recall: 0.4581 - val_loss: 0.3333 - val_recall: 0.4885
Epoch 41/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3501 - recall: 0.4649 - val_loss: 0.3328 - val_recall: 0.4885
Epoch 42/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3475 - recall: 0.4537 - val_loss: 0.3321 - val_recall: 0.5016
Epoch 43/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3501 - recall: 0.4662 - val_loss: 0.3321 - val_recall: 0.5049
Epoch 44/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3503 - recall: 0.4757 - val_loss: 0.3313 - val_recall: 0.4918
Epoch 45/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3458 - recall: 0.4683 - val_loss: 0.3305 - val_recall: 0.4984
Epoch 46/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3456 - recall: 0.4798 - val_loss: 0.3316 - val_recall: 0.4951
Epoch 47/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3478 - recall: 0.4691 - val_loss: 0.3307 - val_recall: 0.4918
Epoch 48/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3497 - recall: 0.4647 - val_loss: 0.3304 - val_recall: 0.5016
Epoch 49/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3455 - recall: 0.4786 - val_loss: 0.3309 - val_recall: 0.4951
Epoch 50/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3457 - recall: 0.4669 - val_loss: 0.3309 - val_recall: 0.4951
In [2335]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  16.37343120574951

Loss function

In [2336]:
# Plotting Train Loss vs Validation Loss
plot(history_2, 'loss')
In [2337]:
# Plotting Train recall vs Validation recall
plot(history_2, 'recall')
In [2338]:
#Predicting the results using best as a threshold
y_train_pred = model_2.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step
Out[2338]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [2339]:
#Predicting the results using 0.5 as the threshold.
y_val_pred = model_2.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2339]:
array([[False],
       [False],
       [False],
       ...,
       [ True],
       [False],
       [False]])
In [2340]:
model_name = "NN with Adam & Dropout - 2 Hidden Layers [64,32]"

train_metric_df.loc[model_name] = recall_score(y_train,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2341]:
results.loc[model_name]=['2','64,32,','relu,tanh',epochs,batch_size,'Adam',(end-start),history_2.history["loss"][-1],history_2.history["val_loss"][-1],history_2.history["recall"][-1],history_2.history["val_recall"][-1]]
In [2347]:
results
Out[2347]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082
NN with Adam & Dropout - 2 Hidden Layers [64,32] 2 64,32, relu,tanh 50 128 Adam 16.373431 0.335301 0.330920 0.481767 0.495082
NN with SMOTE & SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 25 128 Adam 7.675946 0.665111 0.696225 0.751884 0.793443
NN with SMOTE & Adam - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.293055 0.366624 0.425441 0.831539 0.793443
NN with SMOTE,Adam & Dropout - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.655809 0.525529 0.502035 0.781665 0.763934

Classification report

In [2343]:
#classification report
cr=classification_report(y_train,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.88      0.97      0.92      5574
         1.0       0.79      0.48      0.59      1426

    accuracy                           0.87      7000
   macro avg       0.83      0.72      0.76      7000
weighted avg       0.86      0.87      0.85      7000

In [2344]:
#classification report
cr = classification_report(y_val,y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.88      0.96      0.92      1195
         1.0       0.78      0.50      0.61       305

    accuracy                           0.87      1500
   macro avg       0.83      0.73      0.76      1500
weighted avg       0.86      0.87      0.86      1500

Confusion matrix

In [2345]:
#Calculating the confusion matrix
make_confusion_matrix(y_train, y_train_pred)
In [2346]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)
  • The addition of the Dropout parameter dis not seem to make a difference as the results are similar.

Neural Network with Adam Optimizer and Dropout

  • 4 Hidden Layers [32,32,32,32 neurons]
  • Added to see how a 4 Hidden-Layers model will do
In [2223]:
# Clear up the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2224]:
# Initializing the neural network
model_21 = Sequential()

# Add the input layer with neurons and relu as activation function
model_21.add(Dense(32,activation='relu',input_dim = X_train.shape[1]))

# Add dropout with ratio of 0.2 or any suitable value.
model_21.add(Dropout(0.2))

# Add a hidden layer
model_21.add(Dense(32,activation='tanh'))

# Add a hidden layer
model_21.add(Dense(32,activation='relu'))

# Add dropout with ratio of 0.1 or any suitable value.
model_21.add(Dropout(0.1))

# Add a hidden layer
model_21.add(Dense(32,activation='relu'))

# Add the number of neurons required in the output layer.
model_21.add(Dense(1, activation = 'sigmoid'))
In [2225]:
# Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2226]:
## Complete the code to compile the model with binary cross entropy as loss function and recall as the metric.
model_21.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2227]:
# Summary of the model
model_21.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_4 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 3,585 (14.00 KB)
 Trainable params: 3,585 (14.00 KB)
 Non-trainable params: 0 (0.00 B)
In [2228]:
epochs = 50
batch_size = 128
In [2229]:
# Get start time
start = time.time()

# Fitting the ANN with batch_size = 32 and 100 epochs
history_21 = model_21.fit(
    X_train,y_train,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data=(X_val,y_val)
)

# Get end time
end=time.time()
Epoch 1/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - loss: 0.5923 - recall: 0.1641 - val_loss: 0.4587 - val_recall: 0.0000e+00
Epoch 2/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4580 - recall: 0.0170 - val_loss: 0.4263 - val_recall: 0.1410
Epoch 3/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4336 - recall: 0.1744 - val_loss: 0.3990 - val_recall: 0.3049
Epoch 4/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 0.4096 - recall: 0.2833 - val_loss: 0.3729 - val_recall: 0.4328
Epoch 5/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3977 - recall: 0.3614 - val_loss: 0.3585 - val_recall: 0.4328
Epoch 6/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3860 - recall: 0.3548 - val_loss: 0.3548 - val_recall: 0.4525
Epoch 7/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3871 - recall: 0.3874 - val_loss: 0.3498 - val_recall: 0.4754
Epoch 8/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3761 - recall: 0.4073 - val_loss: 0.3455 - val_recall: 0.4820
Epoch 9/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - loss: 0.3720 - recall: 0.4048 - val_loss: 0.3441 - val_recall: 0.4984
Epoch 10/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.3727 - recall: 0.4279 - val_loss: 0.3430 - val_recall: 0.4918
Epoch 11/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.3680 - recall: 0.4225 - val_loss: 0.3428 - val_recall: 0.4721
Epoch 12/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3713 - recall: 0.4061 - val_loss: 0.3413 - val_recall: 0.5082
Epoch 13/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3704 - recall: 0.4162 - val_loss: 0.3392 - val_recall: 0.5180
Epoch 14/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3702 - recall: 0.4304 - val_loss: 0.3380 - val_recall: 0.4885
Epoch 15/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3616 - recall: 0.4315 - val_loss: 0.3372 - val_recall: 0.5049
Epoch 16/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3583 - recall: 0.4429 - val_loss: 0.3369 - val_recall: 0.4984
Epoch 17/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3640 - recall: 0.4250 - val_loss: 0.3382 - val_recall: 0.5016
Epoch 18/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3633 - recall: 0.4414 - val_loss: 0.3375 - val_recall: 0.4951
Epoch 19/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3566 - recall: 0.4327 - val_loss: 0.3367 - val_recall: 0.4852
Epoch 20/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3573 - recall: 0.4342 - val_loss: 0.3360 - val_recall: 0.5213
Epoch 21/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3563 - recall: 0.4552 - val_loss: 0.3365 - val_recall: 0.4984
Epoch 22/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3506 - recall: 0.4437 - val_loss: 0.3372 - val_recall: 0.5082
Epoch 23/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3586 - recall: 0.4391 - val_loss: 0.3367 - val_recall: 0.5082
Epoch 24/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3597 - recall: 0.4369 - val_loss: 0.3375 - val_recall: 0.5115
Epoch 25/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3590 - recall: 0.4548 - val_loss: 0.3375 - val_recall: 0.5016
Epoch 26/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3516 - recall: 0.4444 - val_loss: 0.3355 - val_recall: 0.4918
Epoch 27/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3500 - recall: 0.4467 - val_loss: 0.3358 - val_recall: 0.4885
Epoch 28/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3501 - recall: 0.4471 - val_loss: 0.3343 - val_recall: 0.5016
Epoch 29/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3498 - recall: 0.4476 - val_loss: 0.3350 - val_recall: 0.5016
Epoch 30/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3501 - recall: 0.4392 - val_loss: 0.3324 - val_recall: 0.5115
Epoch 31/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3482 - recall: 0.4548 - val_loss: 0.3347 - val_recall: 0.5049
Epoch 32/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3497 - recall: 0.4502 - val_loss: 0.3347 - val_recall: 0.5148
Epoch 33/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3515 - recall: 0.4315 - val_loss: 0.3340 - val_recall: 0.4885
Epoch 34/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3484 - recall: 0.4324 - val_loss: 0.3320 - val_recall: 0.5180
Epoch 35/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3492 - recall: 0.4332 - val_loss: 0.3336 - val_recall: 0.5016
Epoch 36/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3463 - recall: 0.4513 - val_loss: 0.3341 - val_recall: 0.4885
Epoch 37/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3451 - recall: 0.4491 - val_loss: 0.3328 - val_recall: 0.4984
Epoch 38/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3435 - recall: 0.4458 - val_loss: 0.3335 - val_recall: 0.5115
Epoch 39/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3456 - recall: 0.4589 - val_loss: 0.3329 - val_recall: 0.5049
Epoch 40/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3405 - recall: 0.4519 - val_loss: 0.3337 - val_recall: 0.5115
Epoch 41/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3439 - recall: 0.4549 - val_loss: 0.3339 - val_recall: 0.5148
Epoch 42/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3461 - recall: 0.4695 - val_loss: 0.3356 - val_recall: 0.5016
Epoch 43/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.3471 - recall: 0.4519 - val_loss: 0.3338 - val_recall: 0.4787
Epoch 44/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3449 - recall: 0.4461 - val_loss: 0.3335 - val_recall: 0.4689
Epoch 45/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3449 - recall: 0.4418 - val_loss: 0.3344 - val_recall: 0.5082
Epoch 46/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3462 - recall: 0.4541 - val_loss: 0.3329 - val_recall: 0.5016
Epoch 47/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3451 - recall: 0.4503 - val_loss: 0.3337 - val_recall: 0.4820
Epoch 48/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3493 - recall: 0.4341 - val_loss: 0.3373 - val_recall: 0.4984
Epoch 49/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3448 - recall: 0.4366 - val_loss: 0.3360 - val_recall: 0.4918
Epoch 50/50
55/55 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3456 - recall: 0.4491 - val_loss: 0.3354 - val_recall: 0.4951
In [2230]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  16.678699731826782

Loss function

In [2231]:
# Plotting Train Loss vs Validation Loss
plot(history_21, 'loss')
In [2232]:
# Plotting Train recall vs Validation recall
plot(history_21, 'recall')
In [2233]:
#Predicting the results using best as a threshold
y_train_pred = model_21.predict(X_train)
y_train_pred = (y_train_pred > 0.5)
y_train_pred
219/219 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[2233]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [2234]:
#Predicting the results using 0.5 as the threshold.
y_val_pred = model_21.predict(X_val)
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2234]:
array([[False],
       [False],
       [False],
       ...,
       [False],
       [False],
       [False]])
In [2235]:
model_name = "NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32]"

train_metric_df.loc[model_name] = recall_score(y_train,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2236]:
results.loc[model_name]=['4','32,32,32,32','relu,tanh,relu,relu',epochs,batch_size,'Adam',(end-start),history_21.history["loss"][-1],history_21.history["val_loss"][-1],history_21.history["recall"][-1],history_21.history["val_recall"][-1]]
In [2237]:
results
Out[2237]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082

Classification report

In [2238]:
#classification report
cr=classification_report(y_train,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.88      0.97      0.92      5574
         1.0       0.79      0.49      0.61      1426

    accuracy                           0.87      7000
   macro avg       0.84      0.73      0.76      7000
weighted avg       0.86      0.87      0.86      7000

In [2239]:
#classification report
cr = classification_report(y_val,y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.88      0.96      0.92      1195
         1.0       0.76      0.50      0.60       305

    accuracy                           0.87      1500
   macro avg       0.82      0.73      0.76      1500
weighted avg       0.86      0.87      0.85      1500

Confusion matrix

In [2240]:
#Calculating the confusion matrix
make_confusion_matrix(y_train, y_train_pred)
In [2241]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)
  • This model performed a tad worse than the previous model and the loss is similar.
  • This model performed similar to the previous one. There is not much improvement.

Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer

  • 2 Hidden Layers [14,7 neurons]

Let's try to apply SMOTE to balance this dataset and then again apply hyperparamter tuning accordingly.

In [2261]:
sm  = SMOTE(random_state=42)
# Complete the code to fit SMOTE on the training data.
X_train_smote, y_train_smote= sm.fit_resample(X_train, y_train)
print('After UpSampling, the shape of train_X: {}'.format(X_train_smote.shape))
print('After UpSampling, the shape of train_y: {} \n'.format(y_train_smote.shape))
After UpSampling, the shape of train_X: (11148, 11)
After UpSampling, the shape of train_y: (11148,) 

In [2262]:
y_train_smote.value_counts()
Out[2262]:
count
Exited
0.0 5574
1.0 5574

Let's build a model with the balanced dataset

In [2263]:
# Clear the session memory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2264]:
# Initializing the model
model_3 = Sequential()
# Add a input layer
model_3.add(Dense(14,activation='relu',input_dim = X_train_smote.shape[1]))
# Add a hidden layer
model_3.add(Dense(7,activation='relu'))
# Add the required number of neurons in the output layer with a sigmoid activation function.
model_3.add(Dense(1, activation = 'sigmoid'))
In [2265]:
#Complete the code to use SGD as the optimizer.
optimizer = tf.keras.optimizers.SGD(0.001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2266]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_3.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2267]:
model_3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 14)                  │             168 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 7)                   │             105 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 1)                   │               8 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 281 (1.10 KB)
 Trainable params: 281 (1.10 KB)
 Non-trainable params: 0 (0.00 B)
In [2268]:
epochs = 25
batch_size = 128
In [2269]:
# Get start time
start = time.time()

# Fitting the ANN
history_3 = model_3.fit(
    X_train_smote, y_train_smote,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data = (X_val,y_val)
)

# Get end time
end=time.time()
Epoch 1/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 0.7130 - recall: 0.9547 - val_loss: 0.8416 - val_recall: 0.9705
Epoch 2/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.7085 - recall: 0.9503 - val_loss: 0.8293 - val_recall: 0.9639
Epoch 3/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.7045 - recall: 0.9430 - val_loss: 0.8180 - val_recall: 0.9607
Epoch 4/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.7009 - recall: 0.9340 - val_loss: 0.8077 - val_recall: 0.9574
Epoch 5/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6977 - recall: 0.9269 - val_loss: 0.7981 - val_recall: 0.9475
Epoch 6/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6947 - recall: 0.9176 - val_loss: 0.7893 - val_recall: 0.9311
Epoch 7/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6921 - recall: 0.9077 - val_loss: 0.7811 - val_recall: 0.9213
Epoch 8/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6896 - recall: 0.8969 - val_loss: 0.7735 - val_recall: 0.9180
Epoch 9/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6874 - recall: 0.8877 - val_loss: 0.7664 - val_recall: 0.9082
Epoch 10/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6853 - recall: 0.8755 - val_loss: 0.7598 - val_recall: 0.8852
Epoch 11/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6834 - recall: 0.8657 - val_loss: 0.7536 - val_recall: 0.8787
Epoch 12/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6816 - recall: 0.8572 - val_loss: 0.7478 - val_recall: 0.8787
Epoch 13/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6799 - recall: 0.8447 - val_loss: 0.7424 - val_recall: 0.8525
Epoch 14/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6783 - recall: 0.8370 - val_loss: 0.7372 - val_recall: 0.8426
Epoch 15/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6769 - recall: 0.8289 - val_loss: 0.7325 - val_recall: 0.8393
Epoch 16/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6755 - recall: 0.8241 - val_loss: 0.7279 - val_recall: 0.8361
Epoch 17/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6741 - recall: 0.8156 - val_loss: 0.7236 - val_recall: 0.8328
Epoch 18/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6729 - recall: 0.8065 - val_loss: 0.7196 - val_recall: 0.8262
Epoch 19/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6717 - recall: 0.7990 - val_loss: 0.7158 - val_recall: 0.8230
Epoch 20/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6705 - recall: 0.7915 - val_loss: 0.7121 - val_recall: 0.8164
Epoch 21/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6694 - recall: 0.7845 - val_loss: 0.7086 - val_recall: 0.8066
Epoch 22/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6683 - recall: 0.7765 - val_loss: 0.7053 - val_recall: 0.8033
Epoch 23/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6673 - recall: 0.7712 - val_loss: 0.7021 - val_recall: 0.8033
Epoch 24/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6663 - recall: 0.7653 - val_loss: 0.6991 - val_recall: 0.8000
Epoch 25/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6653 - recall: 0.7619 - val_loss: 0.6962 - val_recall: 0.7934
In [2270]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  7.67594575881958
In [2271]:
model_3.evaluate(X_train_smote,y_train_smote)
349/349 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.6869 - recall: 0.7583
Out[2271]:
[0.6646227836608887, 0.7481162548065186]
In [2272]:
model_3.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.6997 - recall: 0.8010
Out[2272]:
[0.6962251663208008, 0.7934426069259644]

Loss function

In [2273]:
#Plotting Train Loss vs Validation Loss
plot(history_3, 'loss')
In [2274]:
# Plotting Train recall vs Validation recall
plot(history_3, 'recall')
In [2275]:
y_train_pred = model_3.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
349/349 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[2275]:
array([[ True],
       [False],
       [False],
       ...,
       [ True],
       [ True],
       [False]])
In [2276]:
y_val_pred = model_3.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Out[2276]:
array([[ True],
       [ True],
       [False],
       ...,
       [ True],
       [False],
       [False]])
In [2277]:
model_name = "NN with SMOTE & SGD - 2 Hidden Layers [14,7]"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2278]:
results.loc[model_name]=['2','14,7','relu,relu',epochs,batch_size,'Adam',(end-start),history_3.history["loss"][-1],history_3.history["val_loss"][-1],history_3.history["recall"][-1],history_3.history["val_recall"][-1]]
In [2279]:
results
Out[2279]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082
NN with Adam & Dropout - 2 Hidden Layers [64,32] 2 64,32, relu,tanh 25 64 Adam 11.130464 0.342465 0.332770 0.470547 0.508197
NN with SMOTE & SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 25 128 Adam 7.675946 0.665111 0.696225 0.751884 0.793443

Classification report

In [2280]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.64      0.45      0.53      5574
         1.0       0.58      0.75      0.65      5574

    accuracy                           0.60     11148
   macro avg       0.61      0.60      0.59     11148
weighted avg       0.61      0.60      0.59     11148

In [2281]:
cr=classification_report(y_val,y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.89      0.45      0.59      1195
         1.0       0.27      0.79      0.40       305

    accuracy                           0.52      1500
   macro avg       0.58      0.62      0.50      1500
weighted avg       0.77      0.52      0.55      1500

Confusion matrix

In [2282]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [2283]:
#Calculating the confusion matrix

make_confusion_matrix(y_val,y_val_pred)
  • This model with balancing the data didn't do badly. The Recall score improved from the last few model but the loss increased.
  • The data balancing helped to improve the performance.

Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer

  • 3 Hidden Layers [32,32,32 neurons]

Let's build a model with the balanced dataset

In [2284]:
# Clear the session memmory
backend.clear_session()

# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2285]:
# Initializing the model
model_4 = Sequential()
# Add a input layer
model_4.add(Dense(32,activation='relu',input_dim = X_train_smote.shape[1]))
# Add a hidden layer
model_4.add(Dense(32,activation='relu'))
#Add a hidden layer
model_4.add(Dense(32,activation='relu'))
# Add the required number of neurons in the output layer and a suitable activation function.
model_4.add(Dense(1, activation = 'sigmoid'))
In [2286]:
model_4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,529 (9.88 KB)
 Trainable params: 2,529 (9.88 KB)
 Non-trainable params: 0 (0.00 B)
In [2287]:
#Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam()

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2288]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_4.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2289]:
model_4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,529 (9.88 KB)
 Trainable params: 2,529 (9.88 KB)
 Non-trainable params: 0 (0.00 B)
In [2290]:
epochs = 25
batch_size = 128
In [2291]:
# Get start time
start = time.time()


# Fitting the ANN

history_4 = model_4.fit(
    X_train_smote,y_train_smote,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data = (X_val,y_val)
)

# Get end time
end=time.time()
Epoch 1/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - loss: 0.6599 - recall: 0.4446 - val_loss: 0.5342 - val_recall: 0.6754
Epoch 2/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5209 - recall: 0.7777 - val_loss: 0.4619 - val_recall: 0.7443
Epoch 3/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4723 - recall: 0.8006 - val_loss: 0.4373 - val_recall: 0.7475
Epoch 4/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4565 - recall: 0.8020 - val_loss: 0.4294 - val_recall: 0.7410
Epoch 5/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4464 - recall: 0.8002 - val_loss: 0.4257 - val_recall: 0.7377
Epoch 6/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4389 - recall: 0.8000 - val_loss: 0.4231 - val_recall: 0.7377
Epoch 7/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4328 - recall: 0.8013 - val_loss: 0.4215 - val_recall: 0.7377
Epoch 8/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4277 - recall: 0.8004 - val_loss: 0.4209 - val_recall: 0.7475
Epoch 9/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4230 - recall: 0.7996 - val_loss: 0.4210 - val_recall: 0.7508
Epoch 10/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4187 - recall: 0.8011 - val_loss: 0.4203 - val_recall: 0.7508
Epoch 11/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4148 - recall: 0.8041 - val_loss: 0.4205 - val_recall: 0.7475
Epoch 12/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4111 - recall: 0.8060 - val_loss: 0.4220 - val_recall: 0.7574
Epoch 13/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4075 - recall: 0.8075 - val_loss: 0.4215 - val_recall: 0.7508
Epoch 14/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4042 - recall: 0.8142 - val_loss: 0.4236 - val_recall: 0.7475
Epoch 15/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.4010 - recall: 0.8148 - val_loss: 0.4239 - val_recall: 0.7475
Epoch 16/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3979 - recall: 0.8142 - val_loss: 0.4261 - val_recall: 0.7541
Epoch 17/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3950 - recall: 0.8153 - val_loss: 0.4253 - val_recall: 0.7475
Epoch 18/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3919 - recall: 0.8160 - val_loss: 0.4263 - val_recall: 0.7508
Epoch 19/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3887 - recall: 0.8180 - val_loss: 0.4240 - val_recall: 0.7475
Epoch 20/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3854 - recall: 0.8214 - val_loss: 0.4262 - val_recall: 0.7377
Epoch 21/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3824 - recall: 0.8225 - val_loss: 0.4259 - val_recall: 0.7279
Epoch 22/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3793 - recall: 0.8232 - val_loss: 0.4239 - val_recall: 0.7213
Epoch 23/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3762 - recall: 0.8254 - val_loss: 0.4239 - val_recall: 0.7180
Epoch 24/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.3732 - recall: 0.8259 - val_loss: 0.4260 - val_recall: 0.7115
Epoch 25/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.3704 - recall: 0.8276 - val_loss: 0.4254 - val_recall: 0.7148
In [2292]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  9.293054819107056
In [2293]:
model_4.evaluate(X_train_smote,y_train_smote)
349/349 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.3631 - recall: 0.7861
Out[2293]:
[0.36068224906921387, 0.8227484822273254]
In [2294]:
model_4.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.4329 - recall: 0.6999
Out[2294]:
[0.42544108629226685, 0.7147541046142578]

Loss function

In [2295]:
# Plotting Train Loss vs Validation Loss
plot(history_4, 'loss')
In [2296]:
# Plotting Train recall vs Validation recall
plot(history_4, 'recall')
In [2297]:
y_train_pred = model_4.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
349/349 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[2297]:
array([[ True],
       [False],
       [False],
       ...,
       [ True],
       [ True],
       [False]])
In [2298]:
y_val_pred = model_4.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2298]:
array([[False],
       [False],
       [False],
       ...,
       [ True],
       [False],
       [False]])
In [2299]:
model_name = "NN with SMOTE & Adam - 3 Hidden Layers [32,32,32]"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2300]:
results.loc[model_name]=['3','32,32,32','relu,relu,relu',epochs,batch_size,'Adam',(end-start),history_4.history["loss"][-1],history_4.history["val_loss"][-1],history_4.history["recall"][-1],history_3.history["val_recall"][-1]]
In [2301]:
results
Out[2301]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082
NN with Adam & Dropout - 2 Hidden Layers [64,32] 2 64,32, relu,tanh 25 64 Adam 11.130464 0.342465 0.332770 0.470547 0.508197
NN with SMOTE & SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 25 128 Adam 7.675946 0.665111 0.696225 0.751884 0.793443
NN with SMOTE & Adam - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.293055 0.366624 0.425441 0.831539 0.793443

Classification report

In [2302]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.83      0.85      0.84      5574
         1.0       0.85      0.82      0.83      5574

    accuracy                           0.84     11148
   macro avg       0.84      0.84      0.84     11148
weighted avg       0.84      0.84      0.84     11148

In [2303]:
cr=classification_report(y_val, y_val_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.92      0.83      0.87      1195
         1.0       0.52      0.71      0.60       305

    accuracy                           0.81      1500
   macro avg       0.72      0.77      0.74      1500
weighted avg       0.84      0.81      0.82      1500

Confusion matrix

In [2304]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [2305]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)
  • Again, it looks like the balancing of the data is having a positive effect. This model performed better than the previous balanced data model. It improved both in the Recall score and loss.

Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout

  • 3 Hidden Layers [32,32,32 neurons]
In [2306]:
# Clear the session memory
backend.clear_session()

#Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(2)
random.seed(2)
tf.random.set_seed(2)
In [2307]:
# Initializing the model
model_5 = Sequential()

# Add required # of neurons to the input layer with relu as activation function
model_5.add(Dense(32,activation='relu',input_dim = X_train_smote.shape[1]))
#model_5.add(BatchNormalization())

# Add dropout rate
model_5.add(Dropout(0.2))

# Add required # neurons to the hidden layer with any activation function.
model_5.add(Dense(32,activation='relu'))

#model_5.add(BatchNormalization())

# Add dropout rate.
model_5.add(Dropout(0.3))

# Add hidden layer with neurons with relu as activation function
model_5.add(Dense(32,activation='relu'))

# Add the required number of neurons in the output layer with a suitable activation function.
model_5.add(Dense(1, activation = 'sigmoid'))
In [2308]:
# Complete the code to use Adam as the optimizer.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

# uncomment one of the following lines to define the metric to be used
# metric = 'accuracy'
metric = keras.metrics.Recall()
# metric = keras.metrics.Precision()
# metric = keras.metrics.F1Score()
In [2309]:
# Complete the code to compile the model with binary cross entropy as loss function and recall as the metric
model_5.compile(loss='binary_crossentropy',optimizer=optimizer,metrics=[metric])
In [2310]:
model_5.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                          Output Shape                         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ dense (Dense)                        │ (None, 32)                  │             384 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout)                    │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout_1 (Dropout)                  │ (None, 32)                  │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_2 (Dense)                      │ (None, 32)                  │           1,056 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_3 (Dense)                      │ (None, 1)                   │              33 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 2,529 (9.88 KB)
 Trainable params: 2,529 (9.88 KB)
 Non-trainable params: 0 (0.00 B)
In [2311]:
epochs = 25
batch_size = 128
In [2312]:
# Get start time
start = time.time()

history_5 = model_5.fit(
    X_train_smote,y_train_smote,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    validation_data = (X_val,y_val))

# Get end time
end=time.time()
Epoch 1/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - loss: 0.6884 - recall: 0.7374 - val_loss: 0.6686 - val_recall: 0.6328
Epoch 2/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.6752 - recall: 0.6240 - val_loss: 0.6430 - val_recall: 0.6033
Epoch 3/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.6635 - recall: 0.6058 - val_loss: 0.6220 - val_recall: 0.6131
Epoch 4/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.6506 - recall: 0.6083 - val_loss: 0.6076 - val_recall: 0.6328
Epoch 5/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.6394 - recall: 0.6385 - val_loss: 0.5945 - val_recall: 0.6492
Epoch 6/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6271 - recall: 0.6765 - val_loss: 0.5821 - val_recall: 0.6623
Epoch 7/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.6221 - recall: 0.6911 - val_loss: 0.5765 - val_recall: 0.6885
Epoch 8/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.6112 - recall: 0.7174 - val_loss: 0.5672 - val_recall: 0.7016
Epoch 9/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5983 - recall: 0.7248 - val_loss: 0.5601 - val_recall: 0.7082
Epoch 10/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5923 - recall: 0.7352 - val_loss: 0.5542 - val_recall: 0.7246
Epoch 11/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5860 - recall: 0.7345 - val_loss: 0.5513 - val_recall: 0.7311
Epoch 12/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5746 - recall: 0.7533 - val_loss: 0.5445 - val_recall: 0.7311
Epoch 13/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5741 - recall: 0.7521 - val_loss: 0.5408 - val_recall: 0.7410
Epoch 14/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5679 - recall: 0.7688 - val_loss: 0.5359 - val_recall: 0.7443
Epoch 15/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5613 - recall: 0.7575 - val_loss: 0.5318 - val_recall: 0.7443
Epoch 16/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5642 - recall: 0.7664 - val_loss: 0.5293 - val_recall: 0.7508
Epoch 17/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5506 - recall: 0.7789 - val_loss: 0.5248 - val_recall: 0.7574
Epoch 18/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5530 - recall: 0.7578 - val_loss: 0.5242 - val_recall: 0.7639
Epoch 19/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5509 - recall: 0.7667 - val_loss: 0.5214 - val_recall: 0.7672
Epoch 20/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5405 - recall: 0.7843 - val_loss: 0.5170 - val_recall: 0.7672
Epoch 21/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5418 - recall: 0.7784 - val_loss: 0.5138 - val_recall: 0.7672
Epoch 22/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5379 - recall: 0.7825 - val_loss: 0.5079 - val_recall: 0.7639
Epoch 23/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5325 - recall: 0.7761 - val_loss: 0.5057 - val_recall: 0.7607
Epoch 24/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5355 - recall: 0.7856 - val_loss: 0.5024 - val_recall: 0.7574
Epoch 25/25
88/88 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.5294 - recall: 0.7726 - val_loss: 0.5020 - val_recall: 0.7639
In [2313]:
# Time taken
print("Time taken in seconds ",end-start)
Time taken in seconds  9.655809164047241
In [2314]:
model_5.evaluate(X_train_smote,y_train_smote)
349/349 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4889 - recall: 0.7872
Out[2314]:
[0.48466727137565613, 0.8101901412010193]
In [2315]:
model_5.evaluate(X_val,y_val)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.5060 - recall: 0.7883
Out[2315]:
[0.502034604549408, 0.7639344334602356]

Loss function

In [2316]:
# Plotting Train Loss vs Validation Loss
plot(history_5, 'loss')
In [2317]:
# Plotting Train recall vs Validation recall
plot(history_5, 'recall')
In [2318]:
y_train_pred = model_5.predict(X_train_smote)
#Predicting the results using 0.5 as the threshold
y_train_pred = (y_train_pred > 0.5)
y_train_pred
349/349 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step
Out[2318]:
array([[ True],
       [False],
       [False],
       ...,
       [False],
       [ True],
       [False]])
In [2319]:
y_val_pred = model_5.predict(X_val)
#Predicting the results using 0.5 as the threshold
y_val_pred = (y_val_pred > 0.5)
y_val_pred
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Out[2319]:
array([[False],
       [False],
       [False],
       ...,
       [ True],
       [False],
       [False]])
In [2320]:
model_name = "NN with SMOTE,Adam & Dropout - 3 Hidden Layers [32,32,32]"

train_metric_df.loc[model_name] = recall_score(y_train_smote,y_train_pred)
valid_metric_df.loc[model_name] = recall_score(y_val,y_val_pred)
In [2321]:
results.loc[model_name]=['3','32,32,32','relu,relu,relu',epochs,batch_size,'Adam',(end-start),history_5.history["loss"][-1],history_5.history["val_loss"][-1],history_5.history["recall"][-1],history_5.history["val_recall"][-1]]
In [2322]:
results
Out[2322]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082
NN with Adam & Dropout - 2 Hidden Layers [64,32] 2 64,32, relu,tanh 25 64 Adam 11.130464 0.342465 0.332770 0.470547 0.508197
NN with SMOTE & SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 25 128 Adam 7.675946 0.665111 0.696225 0.751884 0.793443
NN with SMOTE & Adam - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.293055 0.366624 0.425441 0.831539 0.793443
NN with SMOTE,Adam & Dropout - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.655809 0.525529 0.502035 0.781665 0.763934

Classification report

In [2323]:
cr=classification_report(y_train_smote,y_train_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.79      0.73      0.76      5574
         1.0       0.75      0.81      0.78      5574

    accuracy                           0.77     11148
   macro avg       0.77      0.77      0.77     11148
weighted avg       0.77      0.77      0.77     11148

In [2324]:
#classification report
cr=classification_report(y_val, y_val_pred)  ## Complete the code to check the model's performance on the validation set
print(cr)
              precision    recall  f1-score   support

         0.0       0.92      0.73      0.82      1195
         1.0       0.42      0.76      0.54       305

    accuracy                           0.74      1500
   macro avg       0.67      0.75      0.68      1500
weighted avg       0.82      0.74      0.76      1500

Confusion matrix

In [2325]:
#Calculating the confusion matrix
make_confusion_matrix(y_train_smote, y_train_pred)
In [2326]:
#Calculating the confusion matrix
make_confusion_matrix(y_val,y_val_pred)  ## Complete the code to check the model's performance on the validation set
  • The addition of the Dropout parameter did not improve the previous model. The Recall is lower while the loss is higher.

Model Performance Comparison and Final Model Selection

In [2353]:
results
Out[2353]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer time(secs) Train_loss Valid_loss Train_Recall Valid_Recall
NN with SGD with No Hidden Layers - - - 50 7000 SGD 6.150697 1.137068 1.123595 0.486676 0.518033
NN with SGD - 1 Hidden Layer 1 14 relu 50 7000 SGD 5.692544 0.848839 0.841122 0.840813 0.809836
NN with SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 50 7000 SGD 6.878961 0.828847 0.830915 0.945302 0.963934
NN with SGD - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 7000 SGD 5.446164 0.674501 0.673391 0.314166 0.311475
NN with Adam - 2 Hidden Layers [64,32] 2 64,32 relu,relu 50 128 Adam 16.755201 0.300413 0.341212 0.528050 0.501639
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] 4 32,32,32,32 relu,tanh,relu,relu 50 128 Adam 16.678700 0.334250 0.335369 0.462833 0.495082
NN with Adam & Dropout - 2 Hidden Layers [64,32] 2 64,32, relu,tanh 50 128 Adam 16.373431 0.335301 0.330920 0.481767 0.495082
NN with SMOTE & SGD - 2 Hidden Layers [14,7] 2 14,7 relu,relu 25 128 Adam 7.675946 0.665111 0.696225 0.751884 0.793443
NN with SMOTE & Adam - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.293055 0.366624 0.425441 0.831539 0.793443
NN with SMOTE,Adam & Dropout - 3 Hidden Layers [32,32,32] 3 32,32,32 relu,relu,relu 25 128 Adam 9.655809 0.525529 0.502035 0.781665 0.763934
  • Calculate the difference is the losses for each model
In [2351]:
results.Train_loss - results.Valid_loss
Out[2351]:
0
NN with SGD with No Hidden Layers 0.013473
NN with SGD - 1 Hidden Layer 0.007717
NN with SGD - 2 Hidden Layers [14,7] -0.002068
NN with SGD - 2 Hidden Layers [64,32] 0.001110
NN with Adam - 2 Hidden Layers [64,32] -0.040799
NN with Adam & Dropout - 4 Hidden Layers [32,32,32,32] -0.001120
NN with Adam & Dropout - 2 Hidden Layers [64,32] 0.004381
NN with SMOTE & SGD - 2 Hidden Layers [14,7] -0.031114
NN with SMOTE & Adam - 3 Hidden Layers [32,32,32] -0.058817
NN with SMOTE,Adam & Dropout - 3 Hidden Layers [32,32,32] 0.023494

  • Among all models, the NN with SGD - 2 Hidden Layers [14,7] model achieved the highest training and validation scores.
  • A training Recall score of 0.945302 and a validation Recall score of 0.963934 suggests that the model is performing slightly better on the validation data compared to the training data.
  • This indicates that the model is generalizing well to unseen data.
  • We will use this model as our final model.
In [2354]:
y_test_pred = model_01.predict(X_test)
y_test_pred = (y_test_pred > 0.5)
print(y_test_pred)
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
[[ True]
 [ True]
 [ True]
 ...
 [ True]
 [False]
 [ True]]
In [2355]:
#lets print classification report
cr=classification_report(y_test,y_test_pred)
print(cr)
              precision    recall  f1-score   support

         0.0       0.90      0.12      0.21      1194
         1.0       0.22      0.95      0.35       306

    accuracy                           0.29      1500
   macro avg       0.56      0.53      0.28      1500
weighted avg       0.76      0.29      0.24      1500

In [2356]:
#Calculating the confusion matrix
make_confusion_matrix(y_test,y_test_pred)
  • The model generalized very well on unseen data, as the test data Recall score is .95

Actionable Insights and Business Recommendations

  1. Our neural network model has a Recall score of .95, so it should do well with predicting customers that will indeed exit.

  2. Our analyses revealed that:

  • the percentage of customers that exited was the most in Germany; so the bank can try to run some marketing campaigns geared towards the German customers.
  • a higher percentage of females exited than males; so the bank can try to run some appropriate marketing campaigns geared towards getting more female customers to not exit.
  • a higher percentage of inactive members, exited, than active members; so the bank can come up with initiatives to get members active by giving incentives to inactive members to use their products like gift cards or discounts.
  • customers with ages between about 40 to about 70, exited more and with a higher median age of about 45 than customers that did not exit, at about 37; so the bank can target this age group to ensure that they retain them as customers.
  • customers with 3-4 number of products, exited more; so the bank can come up with ways to reduce the number of products customers use but make sure that they are actively using this reduced number of products.

Power Ahead